Abstract — One of the design objectives in distributed storage
systems is the minimization of the data traffic during the repair
of failed storage nodes. By repairing multiple failures
simultaneously and cooperatively, further reduction of repair
traffic is made possible. We give a closed-form expression of the
optimal tradeoff between the repair traffic and the amount of
storage in each node for cooperative repair. We also show the
existence of cooperative regenerating codes matching the points on
the fundamental tradeoff curve. Specifically, we show the existence
of linear cooperative regenerating codes for functional repair,
with an explicit bound on the required finite field size. Two
families of explicit constructions are given.

Index Terms — Distributed Storage System, Regenerating
Codes, Decentralized Erasure Codes, Network Coding,
Submodular Flow.



I. Introduction 

In order to provide high data reliability, distributed storage
systems disperse data with redundancy to multiple storage
nodes against node failures. There are two common redundancy
schemes: replication, employed by the current Google
file system [1], and erasure coding, used in Oceanstore [2] and
TotalRecall [3]. Although a replication-based scheme is easy to
manage, it has much lower storage efficiency than erasure
coding [4]. With $(n,k)$ maximal-distance separable codes, a
data file is encoded and distributed to $n$ storage nodes, any $k$
of which can reconstruct the original file. The data file remains
intact even though some storage nodes may fail.

In case of node failures, we need to regenerate new nodes 
(called the newcomers) to replace the failed nodes. The 
newcomers are regenerated by downloading some data from 
the surviving nodes. The required traffic for repairing one 
single failed node, called repair-bandwidth, is another metric 
in measuring the system performance, which is essential in 
bandwidth-limited storage networks. In the pioneering work of 
Dimakis et al. [5], a class of erasure codes, called regenerating
codes, is introduced to reduce the repair-bandwidth. 

The repair based on regenerating codes can be roughly
divided into two classes. In the first class, called exact
repair, the contents of the newcomers are exactly the same as
the contents of the failed nodes. The second class is called
functional repair. With functional repair, the contents of the
newcomers are not necessarily identical to those of the failed
nodes, but the property that a data collector connecting to any
subset of $k$ nodes is able to decode the data file is preserved.
On the one hand, in [5], the functional repair problem of minimizing the

K. W. Shum and Y. Hu are with the Institute of Network Coding,
the Chinese University of Hong Kong, Shatin, Hong Kong. Email:
{wkshum,ychu}@inc.cuhk.edu.hk.

This work was partially supported by a grant from the University Grants 
Committee of the Hong Kong Special Administrative Region, China (Project 
No. AoE/E-02/08). 



repair-bandwidth is shown to be equivalent to a single-source
multi-casting problem in network coding theory [6], [7], and an optimal
tradeoff between the repair-bandwidth and the amount of data
stored in each node is derived. Some studies on constructions
of regenerating codes can be found in [8]-[19].

Most of the studies on regenerating codes in the literature 
focus on single-failure recovery for each repair. However, 
in large-scale distributed storage systems, a multiple-failure 
repair is the norm rather than the exception. For instance, in 
some practical systems such as TotalRecall [3], a recovery
process is triggered only after the total number of failed nodes 
has reached a predefined threshold. In peer-to-peer storage 
systems with high churn rate, nodes may join and leave the 
system in batch. This can also be regarded as multiple node 
failures. 

In this paper, we address the problem of repairing multiple
node failures simultaneously, with data exchange
among the newcomers enabled. This is called cooperative
repair or collaborative repair, and was first introduced in [20].
We will call a regenerating code with cooperative repair
a cooperative regenerating code. It is shown that the
repair-bandwidth can be further reduced if the node regeneration
is performed jointly, instead of separately. In [21], a special
class of cooperative regenerating codes is proposed, in which
the newcomers can flexibly select surviving nodes for repair.
In [20] and [21], only the special case of minimum storage per
node is considered. Studies of cooperative regenerating codes
for minimum storage or minimum repair-bandwidth can be
found in [22] and [23].

Using the information flow graph for cooperative repair,
we derive the fundamental tradeoff between the storage per
node and the repair-bandwidth. However, the size of the
information flow graph is unbounded; there are infinitely many
data collectors and an arbitrary number of node failures. Existing
algorithms for network code construction, such as the algorithm
of Jaggi, Sanders et al. [24], assume that the graph is finite,
and require that the finite field size grow with the number of
sink nodes. We therefore cannot apply the algorithm of Jaggi,
Sanders et al. directly to construct cooperative regenerating
codes, unless we truncate the infinite information flow graph
to a finite subgraph. This imposes a maximum on the number of
repairs, and it is not obvious whether there exists a cooperative
regenerating code that can support an arbitrary number of repairs
without restarting the system. Nevertheless, in the single-loss
case, Wu [25] succeeded in showing, by exploiting the
structure of the information flow graph, that we can work over
a fixed finite field and sustain the distributed storage system
for indefinitely many repairs. In this paper, we generalize the
results in [25] to cooperative repair.
Fig. 1. Repairing a single node failure.



A. An Example of Cooperative Repair 

We examine the following example taken from [26]. Four
native data packets $A_1$, $A_2$, $B_1$ and $B_2$ are distributed to four
storage nodes. Each of the four storage nodes stores two packets.
The first one stores $A_1$ and $A_2$, the second stores $B_1$ and $B_2$.
The third node contains two packets $A_1 + B_1$ and $2A_2 + B_2$,
and the last node contains $2A_1 + B_1$ and $A_2 + B_2$. Here,
a packet is interpreted as an element in a finite field, and
addition and multiplication are finite field operations. We can
take $GF(5)$, the finite field of five elements, as the underlying
finite field in this example. It can be readily checked that any
data collector connecting to any two storage nodes can decode
the four original packets.
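The decodability claim can be checked mechanically. The following is a minimal sketch, not part of the original example: it writes each stored packet as a coefficient vector over $(A_1, A_2, B_1, B_2)$ and verifies that the four packets of any two nodes have rank 4 over $GF(5)$. The names `NODES`, `rank_mod_p` and `decodable_pairs` are illustrative choices.

```python
from itertools import combinations

P = 5  # field size; the example works over GF(5)

# Each packet is a coefficient vector over (A1, A2, B1, B2).
NODES = {
    1: [(1, 0, 0, 0), (0, 1, 0, 0)],   # A1, A2
    2: [(0, 0, 1, 0), (0, 0, 0, 1)],   # B1, B2
    3: [(1, 0, 1, 0), (0, 2, 0, 1)],   # A1+B1, 2*A2+B2
    4: [(2, 0, 1, 0), (0, 1, 0, 1)],   # 2*A1+B1, A2+B2
}


def rank_mod_p(rows, p=P):
    """Rank of a matrix over GF(p), computed by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                factor = m[i][col]
                m[i] = [(a - factor * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def decodable_pairs():
    """Map each pair of nodes to True if their packets determine the file."""
    return {pair: rank_mod_p(NODES[pair[0]] + NODES[pair[1]]) == 4
            for pair in combinations(NODES, 2)}
```

Calling `decodable_pairs()` returns one entry per pair of nodes; every entry being `True` corresponds to the $(4,2)$ recovery property stated above.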

Suppose that the first node fails. The naive method to
repair the first node is to first reconstruct the four packets by
connecting to any other two nodes, from which we can recover
the two required packets $A_1$ and $A_2$. Four packet transmissions
are required in the naive method. The repair-bandwidth can be
reduced from four packets to three packets by making three
connections as in Fig. 1. Each of the three remaining nodes
adds its two stored packets and sends the sum to the newcomer,
who can then subtract off $B_1 + B_2$ and obtain $A_1 + 2A_2$ and
$2A_1 + A_2$. The packets $A_1$ and $A_2$ can now be solved readily.
Hence, the newcomer can be regenerated exactly by sending
three packets.

If two storage nodes fail simultaneously, four packets per
newcomer are required if the newcomers are repaired separately
(see Fig. 2). Each of the newcomers has to download
four packets from the two surviving nodes, reconstruct packets
$A_1$, $A_2$, $B_1$ and $B_2$, and re-encode the desired packets.
However, if exchange of data among the two newcomers
is allowed, the total repair-bandwidth can be reduced from
eight packets to six packets (see Fig. 3). The first newcomer
gets $A_1$ and $A_1 + B_1$, while the second newcomer gets $A_2$
and $2A_2 + B_2$. The first newcomer then figures out $B_1$ and
$2A_1 + B_1$ by taking the difference and the sum of the two
inputs. The packet $B_1$ is stored and $2A_1 + B_1$ is sent to the
second newcomer. Likewise, the second newcomer computes
$B_2$ and $A_2 + B_2$, stores $A_2 + B_2$ and sends $B_2$ to the first
Fig. 2. Individual repair of multiple failures.
Fig. 3. Cooperative regeneration of multiple failures.

newcomer. The two newcomers are regenerated exactly with 
six packet transmissions. This example illustrates the potential 
benefit of cooperative repair. 
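The two repair procedures of this example can be simulated directly. The sketch below is illustrative only. It assumes, as the packets exchanged in the text suggest, that the two failed nodes in the cooperative case are the second and the fourth, with the first and third nodes acting as helpers. The function names are hypothetical.

```python
P = 5  # all arithmetic is over GF(5), as in the example


def encode(a1, a2, b1, b2):
    """Contents of the four storage nodes for the given native packets."""
    return {
        1: (a1, a2),
        2: (b1, b2),
        3: ((a1 + b1) % P, (2 * a2 + b2) % P),
        4: ((2 * a1 + b1) % P, (a2 + b2) % P),
    }


def repair_node1(nodes):
    """Single-failure repair of node 1; each helper sends one packet."""
    s2 = sum(nodes[2]) % P               # B1 + B2
    u = (sum(nodes[3]) - s2) % P         # A1 + 2*A2
    w = (sum(nodes[4]) - s2) % P         # 2*A1 + A2
    inv = pow((-3) % P, -1, P)           # u - 2w = -3*A1 and w - 2u = -3*A2
    return ((u - 2 * w) * inv % P, (w - 2 * u) * inv % P), 3


def repair_nodes_2_and_4(nodes):
    """Cooperative repair of nodes 2 and 4 (assumed failure pattern)."""
    # Phase 1: each newcomer downloads one packet from each surviving node.
    x1, y1 = nodes[1][0], nodes[3][0]                 # A1 and A1+B1
    x2, y2 = nodes[1][1], nodes[3][1]                 # A2 and 2*A2+B2
    b1, mix1 = (y1 - x1) % P, (y1 + x1) % P           # B1 and 2*A1+B1
    b2, mix2 = (y2 - 2 * x2) % P, (y2 - x2) % P       # B2 and A2+B2
    # Phase 2: the first newcomer sends 2*A1+B1, the second sends B2.
    packets_sent = 4 + 2
    return (b1, b2), (mix1, mix2), packets_sent
```

Both functions return the regenerated contents together with the number of packets transmitted, so the counts of three and six packets quoted in the text can be compared against the simulation.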

B. Formal Definition of Cooperative Repair 

Let $Q$ be an alphabet set of size $q$. We will call an element
in $Q$ a symbol, and use a symbol as the unit of data. The
data is regarded as a $B$-tuple $\mathbf{m} \in Q^B$, with each component
drawn from $Q$. The distributed storage system consists of $n$
nodes, and each node stores $\alpha$ symbols. We index the storage
nodes by $\{1,2,\ldots,n\}$. There is an unlimited number of data
collectors who want to download this data file from the storage
nodes.

Time is divided into stages. We index the stages by non-negative
integers. For $t \geq 0$, let the content of the $i$-th node
in the $t$-th stage be denoted by an $\alpha$-tuple $\mathbf{x}(t,i) \in Q^\alpha$. The
distributed storage system is initialized at stage 0 by setting
$\mathbf{x}(0,i) = e_i(\mathbf{m})$ for $i = 1,2,\ldots,n$, where $e_i : Q^B \to Q^\alpha$ is
an encoding function.

For a subset $S$ of $\{1,2,\ldots,n\}$, we let

$$\mathbf{x}(t,S) := \{\mathbf{x}(t,i) : i \in S\}$$

be the content of the storage nodes indexed by $S$. The design
objective is two-fold.

1) File retrieval. In each stage, a data collector can reconstruct
the data file by connecting to any $k$ out of
the $n$ storage nodes. This property is called the $(n,k)$
recovery property. Mathematically, this means that for
any $k$-subset $S$ of $\{1,2,\ldots,n\}$ and $t \geq 0$, there is a
decoding function

$$f_{t,S} : Q^{k\alpha} \to Q^B$$

such that $f_{t,S}(\mathbf{x}(t,S)) = \mathbf{m}$ for all $\mathbf{m} \in Q^B$.

2) Multi-node recovery. When the number of node failures
in stage $s-1$ reaches a threshold, say $r$, we jointly
regenerate $r$ newcomers, and advance to stage $s$. For
$s = 1,2,3,\ldots$, let $\mathcal{R}_s$ be the set of $r$ storage nodes
which fail in stage $s-1$ and are regenerated in stage $s$.
The set $\mathcal{R}_s$ contains $r$ elements in $\{1,2,\ldots,n\}$. For
each storage node $i \in \mathcal{R}_s$, let $\mathcal{H}_{s,i}$ be the $d$ storage
nodes in stage $s-1$, called the helpers, from which data
is downloaded to node $i$ in stage $s$. The set $\mathcal{H}_{s,i}$ is a
subset of $\{1,2,\ldots,n\} \setminus \mathcal{R}_s$ with cardinality $d$. Different
newcomers may connect to different sets of $d$ helpers.
The repair procedure is divided into two phases.

In the first phase, each of the $r$ newcomers downloads $\beta_1$
symbols from each of its $d$ helpers. For $i \in \mathcal{R}_s$ and $j \in \mathcal{H}_{s,i}$,
the symbols sent from node $j$ to newcomer $i$ are denoted
by $g_{s,j,i}(\mathbf{x}(s-1,j))$, where

$$g_{s,j,i} : Q^\alpha \to Q^{\beta_1}$$

is an encoding function.

In the second phase, the $r$ newcomers exchange data
among themselves. Every newcomer sends $\beta_2$ symbols
to each of the other $r-1$ newcomers. For $i_1, i_2 \in \mathcal{R}_s$,
let

$$g'_{s,i_1,i_2} : Q^{d\beta_1} \to Q^{\beta_2}$$

be the encoding function in the second phase, and

$$\mathbf{y}(s,i_1,i_2) = g'_{s,i_1,i_2}(\{g_{s,j,i_1}(\mathbf{x}(s-1,j)) : j \in \mathcal{H}_{s,i_1}\})$$

be the symbols sent from newcomer $i_1$ to newcomer $i_2$.

For each $i \in \mathcal{R}_s$, the content of the regenerated node $i$,
$\mathbf{x}(s,i)$, is obtained by applying a mapping

$$h_{s,i} : Q^{d\beta_1 + (r-1)\beta_2} \to Q^\alpha$$

to $g_{s,j,i}(\mathbf{x}(s-1,j))$ for $j \in \mathcal{H}_{s,i}$ and $\mathbf{y}(s,i_1,i)$ for
$i_1 \in \mathcal{R}_s \setminus \{i\}$.

For those storage nodes that do not fail in stage $s-1$, their
content does not change, i.e., $\mathbf{x}(s,i) = \mathbf{x}(s-1,i)$ for
$i \notin \mathcal{R}_s$.

A cooperative regenerating code, or a cooperative regeneration
scheme, is a collection of encoding and decoding functions
$e_i$, $f_{t,S}$, $g_{s,j,i}$, $g'_{s,i_1,i_2}$ and $h_{s,i}$, such that the $(n,k)$ recovery
property holds in all stages $t \geq 0$, and for all possible failure
patterns $\mathcal{R}_s$ and all choices of helper sets $\mathcal{H}_{s,i}$, $s \geq 1$.

A few more definitions and remarks are in order. 



• If each storage node contains $B/k$ symbols, then the
regenerating code is said to have the maximal-distance
separable (MDS) property.

• If $\mathbf{x}(t+1,i) = \mathbf{x}(t,i)$ for all $t \geq 0$ and $i \in \{1,2,\ldots,n\}$,
then the regenerating code is said to be exact.

• The repair-bandwidth per newcomer is denoted by

$$\gamma := d\beta_1 + (r-1)\beta_2.$$

• The encoding functions $g_{s,j,i}$, $g'_{s,i_1,i_2}$ and $h_{s,i}$ depend
on the indices of the failed nodes, $\mathcal{R}_s$, and the indices
of the helper nodes, $\mathcal{H}_{s,i}$, and may also depend on $\mathcal{R}_t$ and
$\mathcal{H}_{t,i}$ for $t < s$, i.e., the cooperative regeneration scheme
is causal. For ease of notation, this dependency is
suppressed.

• The encoding and decoding are performed over a fixed
alphabet set $Q$ in all stages.

• A pair $(\gamma/B, \alpha/B)$ is called an operating point. The first
coordinate is the ratio of the repair-bandwidth $\gamma$ to
the file size $B$, and the second coordinate is the ratio
of the storage per node $\alpha$ to the file size. An
operating point $(\tilde{\gamma}, \tilde{\alpha})$ is said to be admissible if there is
a cooperative regeneration scheme over an alphabet set
$Q$ with parameters $B$, $\alpha$, $\beta_1$, $\beta_2$ and $\gamma$, such that
$(\tilde{\gamma}, \tilde{\alpha}) = (\gamma/B, \alpha/B)$. (The tildes indicate that the corresponding
variables are normalized by the file size $B$.)

• For given $d$, $k$ and $r$, let $C(d,k,r)$ be the closure of
all admissible operating points achieved by cooperative
regenerating codes with parameters $d$, $k$ and $r$. We call
$C(d,k,r)$ the admissible region. If the parameters $d$, $k$
and $r$ are clear from the context, we will simply write $C$.

• For fixed storage $\alpha$ per node, we choose the two parameters
$\beta_1$ and $\beta_2$ such that the repair-bandwidth $\gamma$ is
minimized. For given $\alpha$ and $B$, with $\tilde{\alpha} = \alpha/B$, we let

$$\gamma^*(\tilde{\alpha}) := \min\{x : (x, \tilde{\alpha}) \in C(d,k,r)\}. \qquad (1)$$

• In the single-loss failure model ($r = 1$), it is shown in [5]
that we only need to consider $d \geq k$ without loss of
generality. In the multiple-loss failure model ($r > 1$), there
is no a priori reason why $d$ cannot be strictly less than $k$.
However, the mathematics for the case $d \geq k$ is simpler
and more tractable. In this paper, we will assume that
$d \geq k$. We will also assume that $k \geq 2$, because a cooperative
regenerating code with $k = 1$ is a trivial case.

We summarize the notations as follows: 

$B$ : The size of the source file.

$n$ : The total number of storage nodes.

$d$ : Each newcomer connects to $d$ surviving nodes.

$k$ : Each data collector connects to $k$ storage nodes.

$r$ : The number of nodes repaired simultaneously.

$\alpha$ : Storage per node.

$\beta_1$ : Number of symbols downloaded by a newcomer from each helper in the 1st phase.

$\beta_2$ : Number of symbols sent by a newcomer to each other newcomer in the 2nd phase.

$\gamma$ : Total repair-bandwidth per newcomer.

C. Main Results and Organization 

The main result of this paper is a closed-form expression
for the region $C(d,k,r)$. The statement of the main theorem
requires the following notation.
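Definitions: For $j = 2, 3, \ldots, k$, define

$$\tilde{\alpha}_j := \frac{2(d-k+j)+r-1}{D_j}, \qquad (2)$$

$$\tilde{\gamma}_j := \frac{2d+r-1}{D_j}, \qquad (3)$$

where $D_j$ is a short-hand notation for

$$D_j := k(2d-2k+2j+r-1) - j(j-1). \qquad (4)$$

For $\ell = 0, 1, \ldots, \lfloor k/r \rfloor$, define

$$\tilde{\alpha}'_\ell := \frac{d-k+r(\ell+1)}{D'_\ell}, \qquad (5)$$

$$\tilde{\gamma}'_\ell := \frac{d+r-1}{D'_\ell}, \qquad (6)$$

where

$$D'_\ell := k(d+r(\ell+1)-k) - r^2\ell(\ell+1)/2. \qquad (7)$$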

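Definitions: For non-negative integer $j$ and positive integer
$m$, let

$$\Delta_{j,m} := \lfloor j/m \rfloor m^2 + (j - \lfloor j/m \rfloor m)^2. \qquad (8)$$

For $j = 1,2,\ldots,k$, let

$$\mu(j) := \begin{cases} \dfrac{j(d-k) + (j^2 + \Delta_{j,r})/2}{jr - \Delta_{j,r}} & \text{if } \Delta_{j,r} < jr, \\ \infty & \text{if } \Delta_{j,r} = jr. \end{cases}$$

Let $\mu(0) := 0$.

We note that $\Delta_{0,m} = 0$ and $\Delta_{1,m} = 1$ for all $m \geq 1$. Also, for
$j \geq 2$ and $m \geq 1$,

$$j \leq \Delta_{j,m} \leq jm.$$

Equality $\Delta_{j,m} = jm$ holds if and only if $j$ is divisible by $m$.
(The quantities $\lfloor j/m \rfloor$ and $j - \lfloor j/m \rfloor m$ are respectively the
quotient and the remainder when we divide $j$ by $m$. The value
of $\Delta_{j,m}$ can be interpreted as the maximum value of $\sum_i x_i^2$
subject to the constraints $\sum_i x_i = j$ and $0 \leq x_i \leq m$ for
all $i$. The motivation for the definition of $\mu(j)$ will be given
in Section III.)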
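The interpretation of $\Delta_{j,m}$ as a constrained maximum of a sum of squares can be checked by brute force for small parameters. The sketch below is a hedged illustration: it assumes the variables $x_i$ are integers and that $j$ variables are available (enough, since unused variables can be set to zero). The function names are hypothetical.

```python
from itertools import product


def delta(j, m):
    """Delta_{j,m} = floor(j/m) * m^2 + (j - floor(j/m) * m)^2, as in (8)."""
    quotient, remainder = divmod(j, m)
    return quotient * m * m + remainder * remainder


def max_sum_of_squares(j, m):
    """Brute-force maximum of sum(x_i^2) over integers 0 <= x_i <= m with sum j."""
    parts = max(j, 1)  # j variables always suffice; extra ones would be zero
    best = 0
    for xs in product(range(m + 1), repeat=parts):
        if sum(xs) == j:
            best = max(best, sum(x * x for x in xs))
    return best
```

For instance, `delta(2, 3)` evaluates the formula for $j = 2$, $m = 3$, and `max_sum_of_squares(2, 3)` searches all admissible integer vectors for the same parameters. The maximum is attained by filling as many variables as possible to the cap $m$, which is where the quotient and remainder in (8) come from.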

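Theorem 1. The admissible region $C(d,k,r)$ is equal to the
convex hull of the union of

$$\{(\tilde{\gamma}_j, \tilde{\alpha}_j) : j = 2,3,\ldots,k-1,\ d \leq (r-1)\mu(j)\}, \qquad (9)$$

$$\{(\tilde{\gamma}'_{\lfloor j/r \rfloor}, \tilde{\alpha}'_{\lfloor j/r \rfloor}) : j = 2,3,\ldots,k-1,\ d > (r-1)\mu(j)\}, \qquad (10)$$

$$\left\{\left(\frac{d+r-1}{k(d+r-k)} + c,\ \frac{1}{k}\right) : c \geq 0\right\}, \qquad (11)$$

and

$$\left\{\left(\frac{2d+r-1}{k(2d+r-k)},\ \frac{2d+r-1}{k(2d+r-k)} + c\right) : c \geq 0\right\}. \qquad (12)$$

When $r = 1$, we define $0 \cdot \infty = \infty$ in (9) and (10).

We note that each of the sets in (9) and (10) contains at
most $k-2$ points. The sets in (11) and (12) are a horizontal
and a vertical ray, respectively.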
Fig. 4. Tradeoff between storage and repair bandwidth ($B = 1$, $d = 5$,
$k = 4$, $r = 3$).



There are two particular operating points of special interest.
The first one,

$$(\tilde{\gamma}_{\mathrm{MSCR}}, \tilde{\alpha}_{\mathrm{MSCR}}) := (\tilde{\gamma}'_0, \tilde{\alpha}'_0) = \left(\frac{d+r-1}{k(d+r-k)},\ \frac{1}{k}\right),$$

is called the minimum-storage cooperative regenerating
(MSCR) point. This point is the end point of the half-line in (11).
The second one,

$$(\tilde{\gamma}_{\mathrm{MBCR}}, \tilde{\alpha}_{\mathrm{MBCR}}) := (\tilde{\gamma}_k, \tilde{\alpha}_k) = \frac{2d+r-1}{k(2d+r-k)}\,(1,1),$$

is called the minimum-bandwidth cooperative regenerating
(MBCR) point. This point is the end point of the half-line
in (12).

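An operating point $(\tilde{\gamma}_1, \tilde{\alpha}_1)$ is said to Pareto-dominate
another point $(\tilde{\gamma}_2, \tilde{\alpha}_2)$ if $\tilde{\gamma}_1 \leq \tilde{\gamma}_2$ and $\tilde{\alpha}_1 \leq \tilde{\alpha}_2$. An operating
point $(\tilde{\gamma}, \tilde{\alpha})$ is called Pareto-optimal if it is in $C(d,k,r)$ and
not Pareto-dominated by other operating points in $C(d,k,r)$.
In other words, a point $(\tilde{\gamma}, \tilde{\alpha})$ in the admissible region is
Pareto-optimal if, for any other operating point $(\tilde{\gamma}', \tilde{\alpha}')$ in the
admissible region, we have either $\tilde{\gamma}' > \tilde{\gamma}$ or $\tilde{\alpha}' > \tilde{\alpha}$, or both.
The MSCR (resp. MBCR) point is the Pareto-optimal point
with minimum $\tilde{\alpha}$ (resp. $\tilde{\gamma}$).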

When $r = 1$, Theorem 1 reduces to the corresponding
result for single-loss recovery in [5, Theorem 1]. Indeed, we
have $\mu(j) = \infty$ for $j = 1,2,\ldots,k$ when $r = 1$. Using the
convention that $0 \cdot \infty = \infty$, the set in (9) contains $k-2$
operating points

$$(\tilde{\gamma}_j, \tilde{\alpha}_j) = \left(\frac{2d}{2k(d-k+j) - j(j-1)},\ \frac{2(d-k+j)}{2k(d-k+j) - j(j-1)}\right) \qquad (13)$$

for $j = 2,3,\ldots,k-1$, while the set in (10) is empty. The
extreme points of $C(d,k,1)$ are the points in (13) and

$$\left(\frac{d}{k(d+1-k)},\ \frac{1}{k}\right), \qquad \left(\frac{2d}{k(2d+1-k)},\ \frac{2d}{k(2d+1-k)}\right).$$

As a numerical example, we illustrate the admissible region
$C(5,4,3)$ (with parameters $d = 5$, $k = 4$, $r = 3$) in
Fig. 4. The solid line (marked by squares) is the boundary
of the region $C(5,4,3)$. The set in (9) contains two points,
namely $(12/30, 8/30) = (0.4, 0.2667)$ and $(12/34, 10/34) =
(0.3529, 0.2941)$, and the set in (10) is empty. The MSCR and
MBCR points are respectively $(7/16, 1/4) = (0.4375, 0.25)$
and $(1/3, 1/3) = (0.3333, 0.3333)$. For comparison, we also
plot in Fig. 4 the optimal tradeoff curve for single-failure repair
(marked by circles).
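The corner points quoted in this numerical example can be reproduced with exact rational arithmetic. The sketch below assumes the readings $\tilde{\gamma}_j = (2d+r-1)/D_j$, $\tilde{\alpha}_j = (2(d-k+j)+r-1)/D_j$ and $D_j = k(2d-2k+2j+r-1) - j(j-1)$, together with the MSCR and MBCR expressions stated earlier. The function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def d_j(j, d, k, r):
    """The quantity D_j = k(2d - 2k + 2j + r - 1) - j(j - 1)."""
    return k * (2 * d - 2 * k + 2 * j + r - 1) - j * (j - 1)


def first_kind_point(j, d, k, r):
    """Normalized operating point (gamma_j, alpha_j)."""
    denom = d_j(j, d, k, r)
    gamma = Fraction(2 * d + r - 1, denom)
    alpha = Fraction(2 * (d - k + j) + r - 1, denom)
    return gamma, alpha


def mscr_point(d, k, r):
    """Minimum-storage point: (d + r - 1) / (k(d + r - k)) and 1/k."""
    return Fraction(d + r - 1, k * (d + r - k)), Fraction(1, k)


def mbcr_point(d, k, r):
    """Minimum-bandwidth point: both coordinates (2d + r - 1) / (k(2d + r - k))."""
    value = Fraction(2 * d + r - 1, k * (2 * d + r - k))
    return value, value


if __name__ == "__main__":
    d, k, r = 5, 4, 3
    for j in range(2, k):
        print("j =", j, first_kind_point(j, d, k, r))
    print("MSCR", mscr_point(d, k, r))
    print("MBCR", mbcr_point(d, k, r))
```

Using `Fraction` keeps the coordinates exact, so they can be compared with the fractions in the text without rounding. Setting `j = k` in `first_kind_point` gives the same pair as `mbcr_point`, which matches the definition of the MBCR point as $(\tilde{\gamma}_k, \tilde{\alpha}_k)$.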

In Section VIII, two families of cooperative regenerating
codes for exact repair are constructed explicitly. Both families
have the property $d = k$. The first family matches the MSCR
point, and has parameters $B = kr$, $n \geq d + r$, $\alpha = r$
and $\gamma = d + r - 1$. The second family matches the MBCR
point and has parameters $B = k(k+r)$, $n = d + r$ and
$\alpha = \gamma = 2d + r - 1$. We define the storage efficiency as the
number of symbols in the data file divided by the total number
of symbols in the $n$ storage nodes. The first (resp. second)
construction yields regenerating codes with storage efficiency
$k/n$ (resp. $k/(2k+r-1)$).

In Section II, we define the information flow graph, and
state some definitions and theorems from combinatorial
optimization which will be used in Section V. In Section III,
we give a lower bound on repair-bandwidth with cooperative
recovery. The lower bound is expressed in terms of a
linear programming problem. In Section IV, we derive some
properties of the linear programming problem, and discuss two
kinds of feasible solutions of the linear programming problem.
In Section V, we show that the lower bound is tight. We prove
in Section VI that we can construct functional-repair linear
network codes over a fixed finite field, which match this lower
bound on repair-bandwidth. The two explicit constructions for
exact-repair cooperative regenerating codes mentioned in the
previous paragraph are given in Section VIII. Appendix A
discusses the scenario where the download traffic may not be
symmetric. Some of the longer proofs are relegated to the
remaining appendices.

II. Preliminaries 
A. Information Flow Graph and the Max-Flow Bound 

We review the modeling of the repair process using the
information flow graph as in [20]. The information flow graph
is divided into stages; each stage involves the recovery of $r$
failed nodes. Given parameters $n$, $k$, $d$ and $r$, any directed
graph $G = (V, \mathcal{E})$ which can be constructed according to the
following procedure is called an information flow graph.

• There is one single source vertex $S$ in stage $-1$, representing
the original data file.

• The $n$ storage nodes after initialization are represented
by $n$ vertices in stage 0, called $\mathsf{Out}_i$, for $i = 1,2,\ldots,n$.

• For each $j$ in $\mathcal{R}_s$, we put three vertices in stage $s$: $\mathsf{In}_j$,
$\mathsf{Mid}_j$ and $\mathsf{Out}_j$. For each $j \in \mathcal{R}_s$, there is a directed
edge from $\mathsf{In}_j$ to $\mathsf{Mid}_j$ and a directed edge from $\mathsf{Mid}_j$
to $\mathsf{Out}_j$. We draw $d$ directed edges from the $d$ nodes in
$\mathcal{H}_{s,j}$ in stage $s-1$ to node $j$ in stage $s$. Namely, for
each $i \in \mathcal{H}_{s,j}$, we put a directed edge from $\mathsf{Out}_i$ in stage
$s-1$ to $\mathsf{In}_j$ in stage $s$. The exchange of data among the



Fig. 5. An example of information flow graph $G(5,3,2,2;\alpha,\beta_1,\beta_2)$. Nodes
2 and 3 are repaired in stage 1 ($\mathcal{R}_1 = \{2,3\}$).

$r$ newcomers is modeled by putting a directed edge from $\mathsf{In}_i$
to $\mathsf{Mid}_j$ for all pairs of distinct $i$ and $j$ in $\mathcal{R}_s$.

• A data collector in stage $s$, called DC, is connected to $k$
"out" nodes in the $s$-th or earlier stages.

We assign capacities to the edges as follows. 

• The capacity of an edge terminating at an "out" node
is $\alpha$. This models the storage requirement in each storage
node.

• The capacity of an edge from an "in" node to the "mid"
node of the same newcomer is infinity. The transfer of data is
inside the newcomer and does not contribute to the repair-bandwidth.

• The capacity of the edge from $\mathsf{Out}_i$ in stage $s-1$ to $\mathsf{In}_j$ in stage $s$ is
$\beta_1$ for $i \in \mathcal{H}_{s,j}$. This signifies the amount of data sent
from $\mathsf{Out}_i$ to $\mathsf{In}_j$ in the first phase of the repair process.
For the second phase, the edge from $\mathsf{In}_j$ to $\mathsf{Mid}_\ell$ in stage
$s$, for $j, \ell \in \mathcal{R}_s$ with $j \neq \ell$, is assigned a capacity of
$\beta_2$. (We use the superscript $^{(s)}$ to indicate that a variable
is associated with stage $s$.)

• The edges terminating at a data collector are all of infinite 
capacity. 

The information flow graph so constructed is a directed
acyclic graph. The number of stages may be unlimited and
the information flow graph may be an infinite graph. We will
denote an information flow graph by $G(n,d,k,r;\alpha,\beta_1,\beta_2)$. If
the values of the parameters are understood from the context, we
will simply write $G$. An example of an information flow graph
is shown in Fig. 5.

Definitions: Let $H$ be a directed graph with non-negative real
numbers, called the capacities, assigned to the edges in $H$.
An $(S,T)$-flow in $H$ is a function $f$ from the edge set $\mathcal{E}$ of
$H$ to the non-negative real numbers, such that for every edge
$e \in \mathcal{E}$, $f(e)$ does not exceed the capacity of $e$, and for every
vertex $v$ in $V \setminus \{S,T\}$, the sum of $f(e)$ over all incoming
edges $e$ is equal to the sum of $f(e')$ over all outgoing edges
$e'$. The value of an $(S,T)$-flow $f$ is defined as the sum of $f(e)$
over all directed edges $e$ terminating at vertex $T$. A flow $f$ is
called integral if $f(e)$ is an integer for all $e$. An $(S,T)$-cut is
a partition $(W, \bar{W})$ of the vertex set $V$ of $H$ such that $S \in W$
and $T \in \bar{W}$. The capacity of an $(S,T)$-cut $(W, \bar{W})$ is defined
as the sum of capacities of the edges from $W$ to $\bar{W}$.

The max-flow-min-cut theorem states that the minimal cut 
capacity is equal to the largest possible flow value. 
Definitions: For a given data collector DC in the information
flow graph $G$, we let

$$\mathrm{maxflow}(G, \mathrm{DC})$$

be the maximal flow value from the source vertex $S$ to DC.

Even though the graph $G$ may be infinite, the computation
of the flow from the source vertex to a particular data collector
DC in stage $t$ only involves the subgraph of $G$ from stage $-1$
to stage $t$. For each DC, the problem of determining the max-flow
reduces to a max-flow problem in a finite graph.

An example of a flow in an information flow graph for $n = 6$,
$d = 4$, $k = 3$, $r = 2$, $\alpha = 7$, $\beta_1 = 2$, $\beta_2 = 1$ is shown in
Fig. 6. The edges with positive flow are labeled (and drawn in
red). This is indeed a flow with maximal value, because
there is a cut with capacity 19 (shown as the dashed line in
Fig. 6).
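To make the reduction to a finite max-flow problem concrete, the sketch below builds a one-stage information flow graph following the construction rules of this section and computes the max-flow with a plain Edmonds-Karp routine. It is an illustration under stated assumptions: the parameter values ($n = 5$, $d = 3$, $r = 2$ as in Fig. 5, with the hypothetical capacities $\alpha = 4$, $\beta_1 = \beta_2 = 1$), the vertex labels and the function names are all choices made here, not taken from the paper.

```python
from collections import defaultdict, deque

INF = float("inf")


def max_flow(cap, source, sink):
    """Edmonds-Karp max-flow; `cap` maps directed edges (u, v) to capacities."""
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in cap.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)          # reverse arcs are needed in the residual graph
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        total += push


def info_flow_graph(n, alpha, beta1, beta2, repaired, helpers, dc_nodes):
    """One repair stage; `dc_nodes` lists (stage, node) pairs read by the DC."""
    cap = {}
    for i in range(1, n + 1):
        cap[("S", ("Out", 0, i))] = alpha
    for j in repaired:
        for i in helpers[j]:
            cap[(("Out", 0, i), ("In", 1, j))] = beta1      # first phase
        cap[(("In", 1, j), ("Mid", 1, j))] = INF            # inside newcomer j
        cap[(("Mid", 1, j), ("Out", 1, j))] = alpha         # storage limit
        for other in repaired:
            if other != j:
                cap[(("In", 1, j), ("Mid", 1, other))] = beta2  # second phase
    for stage, i in dc_nodes:
        cap[(("Out", stage, i), "DC")] = INF
    return cap


if __name__ == "__main__":
    helpers = {2: [1, 4, 5], 3: [1, 4, 5]}
    graph = info_flow_graph(5, 4, 1, 1, [2, 3], helpers, [(1, 2), (1, 3)])
    print(max_flow(graph, "S", "DC"))
```

With $n = 5$, $r = 2$ and $d = 3$, the helper set of each newcomer is forced to be the three surviving nodes, so the only free choices in this sketch are the capacities and the nodes read by the data collector.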

According to the max-flow bound of network coding [27],
[28, Theorem 18.3], if all data collectors are able to retrieve
the original file, then the file size $B$ is upper bounded by

$$B \leq \min_{\mathrm{DC}} \mathrm{maxflow}(G, \mathrm{DC}). \qquad (14)$$

This gives an upper bound on the supported file size for
a given information flow graph $G$. Since we want to build
cooperative regenerating schemes that can repair any pattern
of node failures, we take the minimum

$$B \leq \min_{G} \min_{\mathrm{DC}} \mathrm{maxflow}(G, \mathrm{DC}) \qquad (15)$$

over all information flow graphs $G(n,d,k,r;\alpha,\beta_1,\beta_2)$.
Definitions: For given parameters $n$, $d$, $k$, $r$, we denote by

$$C_{\mathrm{MF}}(d,k,r) \qquad (16)$$

the set of operating points $((d\beta_1 + (r-1)\beta_2)/B, \alpha/B)$ which
satisfy the bound in (15).

Any operating point not in $C_{\mathrm{MF}}(d,k,r)$ violates the max-flow
bound, and hence cannot be admissible, i.e.,

$$C_{\mathrm{AD}}(d,k,r) \subseteq C_{\mathrm{MF}}(d,k,r). \qquad (17)$$

We note that for fixed $\alpha$, $\beta_1$ and $\beta_2$, if $B$ satisfies (15), then
(15) is satisfied for all $B'$ between 0 and $B$. Hence, if
$(\tilde{\gamma}, \tilde{\alpha}) \in C_{\mathrm{MF}}(d,k,r)$, then $(c\tilde{\gamma}, c\tilde{\alpha}) \in C_{\mathrm{MF}}(d,k,r)$ for all $c \geq 1$.
Definitions: Let

$$\gamma^*_{\mathrm{MF}}(\tilde{\alpha}) := \min\{x : (x, \tilde{\alpha}) \in C_{\mathrm{MF}}(d,k,r)\}. \qquad (18)$$

B. Combinatorial Optimization 

Definitions: Let $\mathbb{R}$ be the set of real numbers, and $V$ be a
finite set. The cardinality of $V$ is denoted by $|V|$. Let $2^V$ be
the set of all subsets of $V$. A function $f : 2^V \to \mathbb{R}$ is called
submodular if it satisfies

$$f(S) + f(T) \geq f(S \cap T) + f(S \cup T) \qquad (19)$$

for all $S, T \subseteq V$. To show that a function $f$ is submodular, it
is sufficient to check

$$f(S \cup \{u\}) + f(S \cup \{v\}) \geq f(S) + f(S \cup \{u,v\})$$

for all subsets $S \subseteq V$ and $u, v \in V$ (see for example [29,
Thm 44.1]). If (19) holds with equality for all subsets $S$ and
$T$, then $f$ is called modular. A submodular function which is
invariant under all permutations of $V$ is said to be symmetric.
In other words, a submodular function $f$ is symmetric if for
all permutations $\pi$ of $V$, we have $f(\pi(S)) = f(S)$, where
$\pi(S)$ denotes the image of $S$ under the mapping $\pi$.
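Both forms of the submodularity test can be tried on small ground sets by exhaustive enumeration. The following sketch is illustrative: the helper names are hypothetical, and the local test is restricted to distinct elements $u, v$ outside $S$, which is the form in which the exchange criterion is usually stated.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_submodular(f, ground):
    """Check f(S) + f(T) >= f(S & T) + f(S | T) for all pairs of subsets."""
    all_sets = list(subsets(ground))
    return all(f(s) + f(t) >= f(s & t) + f(s | t)
               for s in all_sets for t in all_sets)


def is_submodular_local(f, ground):
    """Check f(S+u) + f(S+v) >= f(S) + f(S+u+v) for distinct u, v not in S."""
    ground = frozenset(ground)
    return all(f(s | {u}) + f(s | {v}) >= f(s) + f(s | {u, v})
               for s in subsets(ground)
               for u in ground - s
               for v in ground - s
               if u != v)
```

As a usage example, a function of the cardinality alone such as `lambda s: min(len(s), 2)` can be passed to both checks, as can `lambda s: len(s) ** 2`. The local test inspects far fewer inequalities than the pairwise one, which is why it is the convenient form for proofs.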

Several examples of submodular functions are given as follows.
For a given function $g : V \to \mathbb{R}$, we use $g(S)$ as a short-hand
notation for $\sum_{x \in S} g(x)$. It is easy to verify that the function
$S \mapsto g(S)$ is a modular function. Let $a_1 \geq a_2 \geq \cdots \geq a_{|V|}$
be $|V|$ real numbers sorted in non-increasing order such that
$a_{i+1} - a_i \geq a_{i+2} - a_{i+1}$ holds for $i = 2, 3, \ldots, |V|$. Then, the
function given by

$$f(S) := a_i \quad \text{if } |S| = i,$$

is a symmetric submodular function.

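For a given vector $\mathbf{v} = [v_1\ v_2\ \ldots\ v_n]$ with non-negative
components, we sort the components of $\mathbf{v}$ in non-increasing
order and let the $j$-th largest component in $\mathbf{v}$ be denoted
by $v_{[j]}$, i.e., $v_{[1]} \geq v_{[2]} \geq \cdots \geq v_{[n]}$. Let $\mathbb{R}^n_+$ be the set of
$n$-dimensional vectors with non-negative components. Given two
vectors $\mathbf{v} = [v_1\ v_2\ \ldots\ v_n]$ and $\mathbf{u} = [u_1\ u_2\ \ldots\ u_n]$ in
$\mathbb{R}^n_+$, we say that $\mathbf{v}$ is majorized by $\mathbf{u}$ if

$$v_{[1]} + v_{[2]} + \cdots + v_{[i]} \leq u_{[1]} + u_{[2]} + \cdots + u_{[i]}$$

for $i = 1, 2, \ldots, n-1$ and

$$v_{[1]} + v_{[2]} + \cdots + v_{[n]} = u_{[1]} + u_{[2]} + \cdots + u_{[n]}.$$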

A submodular function $f$ mapping subsets of $\{1,2,\ldots,n\}$
to $\mathbb{R}_+$, with the additional properties that (i) $f(\emptyset) = 0$
and (ii) $f(S) \geq f(T)$ whenever $S \supseteq T$, is called a rank
function. For a vector $\mathbf{x} = (x_1, \ldots, x_n)$ in $\mathbb{R}^n_+$ and a subset
$S \subseteq \{1,2,\ldots,n\}$, we use the notation

$$\mathbf{x}(S) := \sum_{i \in S} x_i.$$

Given a rank function $f$, the set

$$\{\mathbf{x} \in \mathbb{R}^n_+ : \mathbf{x}(S) \leq f(S),\ \forall S \subseteq \{1,2,\ldots,n\}\}$$

is called a polymatroid. The face of the polymatroid consisting
of the points satisfying $\mathbf{x}(\{1,2,\ldots,n\}) = f(\{1,2,\ldots,n\})$ is called
the base-polymatroid associated with the rank function $f$.

Lemma 2. Let $\mathbf{u} \in \mathbb{R}^{|V|}_+$ be a vector with non-negative
components. Then the function $f : 2^V \to \mathbb{R}_+$ defined by

$$f(S) := \sum_{i=1}^{|S|} u_{[i]}$$

is a symmetric submodular function. The vectors in $\mathbb{R}^{|V|}_+$
which are majorized by $\mathbf{u}$ form a base-polymatroid with rank
function $f$.
The proof is straightforward and is omitted. We give one
numerical example for Lemma 2. Let $\mathbf{u}$ be the vector $[2\ 2\ 1]$.
The vectors in $\mathbb{R}^3_+$ which are majorized by $\mathbf{u}$ form a
base-polymatroid

$$\{\mathbf{x} \in \mathbb{R}^3_+ : \mathbf{x}(S) \leq f(S),\ \forall S \subseteq \{1,2,3\},\ \text{and } x_1 + x_2 + x_3 = 5\},$$

where

$$f(S) = \begin{cases} 0 & \text{if } S = \emptyset, \\ 2 & \text{if } |S| = 1, \\ 4 & \text{if } |S| = 2, \\ 5 & \text{if } |S| = 3, \end{cases}$$

is a symmetric submodular function.
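The equivalence in this example between majorization and membership in the base-polymatroid can be spot-checked numerically. The sketch below restricts attention to integer vectors, which is an assumption made only to keep the enumeration finite. The function names are illustrative.

```python
from itertools import combinations


def is_majorized(v, u):
    """True if v is majorized by u (compare prefix sums of sorted entries)."""
    vs, us = sorted(v, reverse=True), sorted(u, reverse=True)
    if len(vs) != len(us) or sum(vs) != sum(us):
        return False
    return all(sum(vs[:i]) <= sum(us[:i]) for i in range(1, len(vs)))


def rank_function(u):
    """Return f as a function of |S|: the sum of the |S| largest entries of u."""
    us = sorted(u, reverse=True)
    return lambda size: sum(us[:size])


def in_base_polymatroid(x, u):
    """True if x(S) <= f(S) for every proper subset S and x(V) = f(V)."""
    f = rank_function(u)
    n = len(x)
    if sum(x) != f(n):
        return False
    return all(sum(x[i] for i in subset) <= f(len(subset))
               for size in range(1, n)
               for subset in combinations(range(n), size))
```

For `u = [2, 2, 1]`, `rank_function(u)` takes the values 0, 2, 4, 5 on sets of size 0 to 3, matching the function $f$ displayed above, and the two membership tests can be compared on any candidate vector such as `(2, 1, 2)` or `(3, 1, 1)`.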

We need a generalization of the max-flow-min-cut theorem. 
The generalization is in terms of the notion of submodular 
flow, which we introduce below. 

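Definitions: Let $H = (V, \mathcal{E})$ be a directed graph. For a given
subset $T$ of $V$, define the set of incoming edges and the set
of outgoing edges respectively by

$$\Delta^{\mathrm{in}}(T) := \{e = (u,v) \in \mathcal{E} : u \notin T,\ v \in T\},$$
$$\Delta^{\mathrm{out}}(T) := \{e = (u,v) \in \mathcal{E} : u \in T,\ v \notin T\}.$$

Let $\phi : \mathcal{E} \to \mathbb{R}$ be a function on the set of edges of $H$, and
for a set of edges $F$ write $\phi(F)$ for $\sum_{e \in F} \phi(e)$. Define
the boundary of a subset of vertices $T \subseteq V$ by

$$\partial\phi(T) := \phi(\Delta^{\mathrm{out}}(T)) - \phi(\Delta^{\mathrm{in}}(T)).$$

The boundary of $T$ can be considered as the net flow leaving
$T$. Given a submodular function $f : 2^V \to \mathbb{R}$, we say that
a function $\phi : \mathcal{E} \to \mathbb{R}$ is an $f$-submodular flow, or simply
a submodular flow, if

$$\partial\phi(T) \leq f(T) \qquad (20)$$

for all $T \subseteq V$. Let $\mathrm{lb} : \mathcal{E} \to \mathbb{R} \cup \{-\infty\}$ and
$\mathrm{ub} : \mathcal{E} \to \mathbb{R} \cup \{\infty\}$ be respectively lower and upper bounds on the edges,
with $\mathrm{lb}(e) \leq \mathrm{ub}(e)$ for each $e \in \mathcal{E}$. A submodular flow $\phi$ is
said to be feasible if $\mathrm{lb}(e) \leq \phi(e) \leq \mathrm{ub}(e)$ for all $e \in \mathcal{E}$.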



The following theorem, due to A. Frank [30], [31, Theorem
12.1.4], characterizes the existence of a feasible submodular flow.
It is central in the proof of the main theorems in this paper.

Theorem 3 ([30]). Suppose that $p$ is a submodular function
on the vertex set of a directed graph $(V, \mathcal{E})$, satisfying $p(\emptyset) =
p(V) = 0$. There exists a feasible submodular flow if and only
if

$$\mathrm{lb}(\Delta^{\mathrm{out}}(S)) - \mathrm{ub}(\Delta^{\mathrm{in}}(S)) \leq p(S) \qquad (21)$$

for all subsets $S \subseteq V$. Moreover, if lb, ub and $p$ are
integer-valued, then there is a feasible submodular flow which is
integer-valued.

To see that the max-flow-min-cut theorem is a special case
of Frank's theorem, we consider a weighted directed graph
$H = (V, \mathcal{E})$ with two distinguished vertices $S$ and $T$. Define
a function $g : V \to \mathbb{R}$ by

$$g(x) := \begin{cases} B & \text{if } x = S, \\ -B & \text{if } x = T, \\ 0 & \text{otherwise}, \end{cases}$$

where $B$ is a positive real number. For a subset $W$ of the
vertex set, define

$$p(W) := \sum_{x \in W} g(x).$$

For $e \in \mathcal{E}$, let $\mathrm{lb}(e) = 0$ and $\mathrm{ub}(e)$ be the corresponding edge
capacity. The function $p$ so defined is modular and satisfies
$p(\emptyset) = p(V) = 0$. With this choice of submodular function $p$,
a feasible $p$-submodular flow is equivalent to a flow with value
$B$. Indeed, it is easy to see that an $(S,T)$-flow with value $B$
is a feasible $p$-submodular flow. In the reverse direction, let $\phi$
be a feasible $p$-submodular flow. We have

$$B \leq -\partial\phi(\{T\}) = \partial\phi(V \setminus \{T\}) = \sum_{v \in V \setminus \{T\}} \partial\phi(\{v\}) \leq \partial\phi(\{S\}) \leq B.$$
The inequalities follow from the definition of the submodular
function $p$,

$$p(\{x\}) = \begin{cases} B & \text{if } x = S, \\ -B & \text{if } x = T, \\ 0 & \text{if } S \neq x \neq T. \end{cases}$$

Since equalities hold in the above chain of inequalities, we
have $\partial\phi(\{v\}) = 0$ for all vertices $v$ other than $S$ and $T$, i.e.,
$\phi$ satisfies the flow conservation property. The $p$-submodular
flow $\phi$ is indeed a flow. From $-\partial\phi(\{T\}) = B$ we conclude
that the flow has value $B$.

For a subset $W$ of the vertex set containing $T$ but not $S$, the
condition in (21) is equivalent to requiring that the capacity
of the cut $(\bar{W}, W)$ is at least $B$. For every other subset $W$,
we have $p(W) \geq 0$ and the condition holds trivially. Hence, if all
$(S,T)$-cuts have capacity at least $B$, then, by Frank's theorem, we can obtain
a feasible $p$-submodular flow, which is an $(S,T)$-flow with
value $B$. If there is a cut with capacity strictly less than $B$, then
the condition (21) is violated by some subset of the vertex
set, and thus there does not exist a feasible $p$-submodular flow.

III. A Cut-set Lower Bound on Repair-Bandwidth 

Consider a data collector DC which downloads data from $k$ storage nodes. By re-labeling the storage nodes, we can assume without loss of generality that the DC downloads from nodes 1 to $k$. Suppose that among these $k$ nodes, $\ell_0$ of them do not undergo any repair, and the remaining $k - \ell_0$ nodes are repaired in stages 1 to $s$ for some positive integer $s$. For $j = 1, 2, \ldots, s$, suppose that there are $\ell_j$ nodes which are repaired in stage $j$ and connected to the data collector DC. We can check that $\ell_0 + \ell_1 + \cdots + \ell_s = k$ and $1 \le \ell_j \le r$ (for $j \ge 1$). After some re-labeling, we assume that the $\ell_0$ unrepaired nodes are node 1 to node $\ell_0$, the nodes which are repaired in stage 1 are node $\ell_0 + 1$ to node $\ell_0 + \ell_1$, and so on.

In the information flow graph, the data collector DC is connected to $\ell_j$ "out" vertices in stage $j$. A cut $(W, \bar{W})$ with $\bar{W}$ consisting of the data collector DC, the $\ell_0$ "out" vertices in stage 0 associated with nodes 1 to $\ell_0$, and

$$\bigcup_{i} \{\mathrm{In}_i, \mathrm{Mid}_i, \mathrm{Out}_i\}$$

in stage $j$, for $j = 1, 2, \ldots, s$, where the union is taken over the $\ell_j$ nodes $i$ that are repaired in stage $j$ and connected to DC, is called a cut of type $(\ell_0, \ell_1, \ell_2, \ldots, \ell_s)$.

An example of a cut $(W, \bar{W})$ of type (2,1,1,2) is shown in Fig. 7. Nodes 3 and 4 are repaired in stage 1, nodes 4 and 7 are repaired in stage 2, and nodes 5 and 6 are repaired in stage 3. The data collector connects to nodes 1 to 6. The vertices in $\bar{W}$ are drawn in shaded color in Fig. 7.

Theorem 4. For any $(s+1)$-tuple of integers $(\ell_0, \ell_1, \ldots, \ell_s)$ satisfying $\sum_{j=0}^{s} \ell_j = k$ and $1 \le \ell_j \le r$ for $j = 1, 2, \ldots, s$, the file size $B$ is less than or equal to

$$\ell_0 \alpha + \sum_{j=1}^{s} \Big[ \ell_j \Big(d - \sum_{i=0}^{j-1} \ell_i\Big) \beta_1 + \ell_j (r - \ell_j) \beta_2 \Big]. \qquad (22)$$

(The term $d - \sum_{i=0}^{j-1} \ell_i$ is nonnegative, because the summation of $\ell_i$'s is no larger than $k$, and $k$ is assumed to be less than or equal to $d$.)

Fig. 7. A cut of type (2, 1, 1, 2) in a distributed storage system with parameters $d = 6$, $k = 6$ and $r = 2$.

Proof: It suffices to show that the capacity of a cut of type $(\ell_0, \ell_1, \ldots, \ell_s)$ is larger than or equal to (22).

The sum of the capacities of the edges terminating at the $\ell_0$ "out" vertices in stage 0 is $\ell_0 \alpha$. This is the first term in (22). For $j = 1, 2, \ldots, s$, consider an "in" vertex in stage $j$ in $\bar{W}$. Out of the $d$ incoming edges to this "in" vertex, there may be as few as $d - \sum_{i=0}^{j-1} \ell_i$ edges which start from some "out" vertices in $W$. The sum of the capacities of the edges from $W$ to the "in" vertices in stage $j$ is larger than or equal to

$$\ell_j \Big(d - \sum_{i=0}^{j-1} \ell_i\Big) \beta_1.$$

This is the first term inside the square bracket. The second term in the square bracket is the sum of edge capacities to the "mid" vertices in $\bar{W}$. ■

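Theorem 5. If a data file of size $B$ is supported by a cooperative regenerating code with parameters $n$, $d$, $k$, $r$, $\alpha$, $\beta_1$ and $\beta_2$, then for $j = 0, 1, \ldots, k$, we have

$$1 \le (k-j)\frac{\alpha}{B} + j\Big(d - k + \frac{j+1}{2}\Big)\frac{\beta_1}{B} + j(r-1)\frac{\beta_2}{B} \qquad (23)$$

and

$$1 \le (k-j)\frac{\alpha}{B} + \Big[j(d-k) + \frac{j^2 + \Delta_{j,r}}{2}\Big]\frac{\beta_1}{B} + (jr - \Delta_{j,r})\frac{\beta_2}{B}, \qquad (24)$$

where $\Delta_{j,r}$ is as defined earlier.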

Proof: The upper bound in (23) comes from a cut of type

$$(k - j, \underbrace{1, 1, \ldots, 1}_{j}).$$

There are $j + 1$ components in the above vector. The derivation of (23) follows from

$$j(d-k) + \frac{j(j+1)}{2} = (d-k+j) + (d-k+j-1) + \cdots + (d-k+1).$$

The upper bound in (24) comes from a cut of type

$$(k - j, \underbrace{r, r, \ldots, r}_{Q}, R),$$

where $Q$ and $R$ are defined respectively as the quotient and remainder when we divide $j$ by $r$. ($Q$ and $R$ are integers satisfying $j = Qr + R$ and $0 \le R < r$.) There are $Q + 2$ components in the above vector.

Straightforward calculations show that the right-hand side of (22) is

$$(k-j)\alpha + \Big[Qr(d-k+j) - \frac{r^2 Q(Q-1)}{2}\Big]\beta_1 + (d-k+j-Qr)R\beta_1 + R(r-R)\beta_2.$$

Note that $\Delta_{j,r}$ is equal to $Qr^2 + R^2$. In terms of $\Delta_{j,r}$, the coefficient of $\beta_2$ is

$$R(r-R) = Rr - R^2 = (j - Qr)r - R^2 = jr - \Delta_{j,r},$$

and the coefficient of $\beta_1$ is

$$\begin{aligned} & Qr(d-k+j) - \frac{r^2Q(Q-1)}{2} + (d-k+j-Qr)R \\ &= j(d-k+j) - \frac{Q^2r^2 + 2QrR}{2} + \frac{Qr^2}{2} \\ &= j(d-k+j) - \frac{j^2 - R^2}{2} + \frac{Qr^2}{2} \\ &= j(d-k) + \frac{j^2 + \Delta_{j,r}}{2}. \end{aligned}$$

This proves the inequality in (24).

Remark: (i) When $j = 0$, the two inequalities in (23) and (24) are identical and can be simplified to

$$B \le k\alpha. \qquad (25)$$

(ii) When $j = 1$, (23) and (24) are also identical and can be written as

$$B \le (k-1)\alpha + (d-k+1)\beta_1 + (r-1)\beta_2. \qquad (26)$$

(iii) The coefficients of $\alpha$, $\beta_1$ and $\beta_2$ in (23) and (24) are non-negative.

(iv) In the special case of single-loss repair, i.e., when $r = 1$, the coefficients of $\beta_2$ in (23) and (24) vanish.
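The relation between the general cut bound (22) and the two closed forms (23) and (24) can be checked numerically. The following is a minimal sketch, not code from the paper: the function names `delta`, `cut_coeffs`, `coeffs_23` and `coeffs_24` are hypothetical, and each function returns the coefficients of $\alpha$, $\beta_1$ and $\beta_2$ (before dividing by $B$).

```python
def delta(j, r):
    """Delta_{j,r} = Q r^2 + R^2, where j = Q r + R with 0 <= R < r."""
    q, rem = divmod(j, r)
    return q * r * r + rem * rem

def cut_coeffs(ell, d, r):
    """Coefficients (alpha, beta1, beta2) of the bound (22) for a cut of type ell."""
    a, b1, b2 = ell[0], 0, 0
    seen = ell[0]                  # running sum l_0 + ... + l_{j-1}
    for lj in ell[1:]:
        b1 += lj * (d - seen)      # edges from W into the "in" vertices of stage j
        b2 += lj * (r - lj)        # edges from W into the "mid" vertices of stage j
        seen += lj
    return a, b1, b2

def coeffs_23(j, d, k, r):
    """Closed form (23), obtained from the cut type (k-j, 1, ..., 1)."""
    return k - j, j * (d - k) + (j * j + j) // 2, j * (r - 1)

def coeffs_24(j, d, k, r):
    """Closed form (24), obtained from the cut type (k-j, r, ..., r, R)."""
    dl = delta(j, r)
    # j^2 + Delta_{j,r} is always even, so integer division is exact here
    return k - j, j * (d - k) + (j * j + dl) // 2, j * r - dl

# Compare (22) against (23) and (24) for every j, using d = 5, k = 4, r = 3.
d, k, r = 5, 4, 3
for j in range(k + 1):
    q, rem = divmod(j, r)
    assert cut_coeffs((k - j,) + (1,) * j, d, r) == coeffs_23(j, d, k, r)
    assert cut_coeffs((k - j,) + (r,) * q + (rem,), d, r) == coeffs_24(j, d, k, r)
```

For $j = 0$ and $j = 1$ the two closed forms return the same triple, in line with parts (i) and (ii) of the remark.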

Example: We can now show that the example of cooperative regenerating code mentioned in the introductory section is optimal. The system parameters are $B = 4$, $\alpha = 2$ and $d = k = r = 2$. After putting $j = 1$ and $j = 2$ in Theorem 5, the inequalities in (24) become

$$4 \le 2 + \beta_1 + \beta_2,$$
$$4 \le 4\beta_1.$$

This implies that $\beta_1 \ge 1$ and $\beta_1 + \beta_2 \ge 2$. If we want to minimize the repair-bandwidth $\gamma = 2\beta_1 + \beta_2$, the optimal solution is attained at $(\beta_1, \beta_2) = (1, 1)$. The optimal repair-bandwidth is thus equal to 3. The above analysis also shows that if the repair-bandwidth is equal to 3, the values of $\beta_1$ and $\beta_2$ must both be equal to 1. This is indeed the case in the example given in the introduction.

We note that the bounds in (23) and (24) depend on $\alpha$, $\beta_1$ and $\beta_2$ through the ratios $\alpha/B$, $\beta_1/B$ and $\beta_2/B$. We can thus consider the normalized $\alpha/B$, $\beta_1/B$ and $\beta_2/B$ as variables. We have the following linear programming problem.

Definitions: Let $\tilde\alpha := \alpha/B$, $\tilde\beta_1 := \beta_1/B$, $\tilde\beta_2 := \beta_2/B$, and $\tilde\gamma := \gamma/B$ be the normalized values of $\alpha$, $\beta_1$, $\beta_2$ and $\gamma$ respectively. Consider the following optimization problem:

$$\text{Minimize } d\tilde\beta_1 + (r-1)\tilde\beta_2 \qquad (27)$$

subject to the linear constraints in (23) and (24), for $j = 1, 2, \ldots, k$, and $\tilde\beta_1, \tilde\beta_2 \ge 0$. This is a parametric linear programming problem with $\tilde\alpha$ as the parameter. Let

$$\gamma^*_{LP}(\tilde\alpha) \qquad (28)$$

be the optimal value of this linear program, and

$$\mathcal{C}_{LP}(n,k,d,r) = \mathcal{C}_{LP} := \{(\tilde\gamma, \tilde\alpha) \in \mathbb{R}^2 : \tilde\gamma \ge \gamma^*_{LP}(\tilde\alpha)\}. \qquad (29)$$

At this point, we have established the following relationship

$$\mathcal{C}_{AD} \subseteq \mathcal{C}_{MF} \subseteq \mathcal{C}_{LP}. \qquad (30)$$

The first inclusion follows from the max-flow bound in network coding. Since the number of sink nodes in the information flow graph is infinite, we cannot apply existing results in the construction of network codes, such as the algorithm of Jaggi, Sanders et al., to guarantee that we can work over a fixed alphabet set for infinitely many data collectors. The second inclusion follows from a weaker form of the max-flow-min-cut bound in graph theory, namely, the value of any flow is no larger than the capacity of any cut, since we only consider some specific cuts in the information flow graph, not all possible cuts. In later sections, we will show that equalities hold in (30).

Example: Consider a cooperative-repair-based distributed storage system with parameters $d = 5$, $k = 4$ and $r = 3$. We have the following constraints based on (23) and (24):

$$\begin{bmatrix} B\\B\\B\\B\\B\\B\\B\\B \end{bmatrix} \le \begin{bmatrix} 4&0&0\\3&2&2\\2&6&2\\2&5&4\\1&12&0\\1&9&6\\0&17&2\\0&14&8 \end{bmatrix} \begin{bmatrix}\alpha\\ \beta_1\\ \beta_2\end{bmatrix} \qquad (31)$$

with the inequality compared componentwise. We normalize $B$ to 1 unit and consider the case when $\alpha = 1/4$, i.e., the minimum-storage case. We minimize $d\beta_1 + (r-1)\beta_2$ subject to $\beta_1, \beta_2 \ge 0$ and the constraints in (31) by linear programming. The linear constraints and the objective function are illustrated graphically in Fig. 8. The seven solid lines (in blue color) in Fig. 8 are the boundaries of the half planes associated with the seven constraints (row 2 to row 8) in (31). The point $P_1$ is the intersection point of the straight line associated to row 2, i.e., $B = 3\alpha + 2\beta_1 + 2\beta_2$, and the line $\beta_1 = 2\beta_2$. The point $P_2$ is the intersection point of the straight lines associated to rows 3 and 4 in (31). The point $P_3$ is the intersection point of the straight lines associated to rows 5 and 6, and so on. The feasible region is the area to the right and above these seven straight lines. The optimal solution $\beta_1 = \beta_2 = 0.0625$ is indicated by a square in Fig. 8 and the objective function is shown as a dashed line (in red) passing through this point. The repair-bandwidth cannot be less than

$$\gamma^*_{LP}(1/4) = (d + r - 1) \cdot 0.0625 = (5 + 3 - 1) \cdot 0.0625 = 0.4375.$$

Fig. 8. Repair-bandwidth minimization as a linear program ($d = 5$, $k = 4$, $r = 3$, and $\alpha = 1/4$). The objective function $5\beta_1 + 2\beta_2$ is minimized at $\beta_1 = \beta_2 = 0.0625$.

The bound in Theorem 5 is based on the assumption that the download traffic is homogeneous. In Appendix A we show that, at the minimum-storage point, relaxing the homogeneity of the download traffic does not help in further reducing the repair-bandwidth. In the remainder of this paper, we will assume that the download traffic is homogeneous.
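The linear program (27) has only two variables, so for small parameters it can be solved exactly by enumerating the intersection points of pairs of constraint boundaries. The sketch below is an illustration under stated assumptions, not the authors' method: it takes $B = 1$, assumes $\alpha \ge 1/k$ (so that the program is feasible), and uses the hypothetical names `delta`, `constraints` and `min_repair_bandwidth`.

```python
from fractions import Fraction
from itertools import combinations

def delta(j, r):
    """Delta_{j,r} = Q r^2 + R^2, where j = Q r + R with 0 <= R < r."""
    q, rem = divmod(j, r)
    return q * r * r + rem * rem

def constraints(d, k, r, alpha):
    """Half-planes a*b1 + b*b2 >= c: b1 >= 0, b2 >= 0, and (23), (24) with B = 1."""
    rows = [(Fraction(1), Fraction(0), Fraction(0)),
            (Fraction(0), Fraction(1), Fraction(0))]
    for j in range(1, k + 1):
        rhs = 1 - (k - j) * alpha
        rows.append((Fraction(2 * j * (d - k) + j * j + j, 2),
                     Fraction(j * (r - 1)), rhs))                # (23)
        dl = delta(j, r)
        rows.append((Fraction(2 * j * (d - k) + j * j + dl, 2),
                     Fraction(j * r - dl), rhs))                 # (24)
    return rows

def min_repair_bandwidth(d, k, r, alpha):
    """Return (gamma, beta1, beta2) minimizing d*beta1 + (r-1)*beta2."""
    rows = constraints(d, k, r, Fraction(alpha))
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                          # parallel boundaries, no vertex
        x = (c1 * b2 - c2 * b1) / det         # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y >= c for a, b, c in rows):
            value = d * x + (r - 1) * y
            if best is None or value < best[0]:
                best = (value, x, y)
    return best

# The example with d = 5, k = 4, r = 3 at the minimum-storage point alpha = 1/4.
gamma, beta1, beta2 = min_repair_bandwidth(5, 4, 3, Fraction(1, 4))
print(gamma, beta1, beta2)
```

Exact rational arithmetic is used so that coinciding intersection points (several boundary lines passing through the optimum) compare equal; a floating-point solver would serve equally well for larger instances.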

IV. The Two Types of Operating Points 

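In this section we explain the two sets of operating points, those of the first type and those of the second type. For $j = 2, 3, \ldots, k$, the point $(\gamma_j, \alpha_j)$ is said to be an operating point of the first type. For $\ell = 0, 1, \ldots, \lfloor k/r \rfloor$, the point $(\gamma'_\ell, \alpha'_\ell)$ is said to be an operating point of the second type.

Definitions: For $j = 1, 2, \ldots, k$, we let $L_j(\alpha)$ be the straight line in the $\beta_1$-$\beta_2$ plane with equation

$$B = (k-j)\alpha + \Big(j(d-k) + \frac{j^2 + \Delta_{j,r}}{2}\Big)\beta_1 + (jr - \Delta_{j,r})\beta_2,$$

i.e., (24) with equality, and $L'_j(\alpha)$ be the straight line with equation

$$B = (k-j)\alpha + \Big(j(d-k) + \frac{j^2 + j}{2}\Big)\beta_1 + (jr - j)\beta_2,$$

i.e., (23) with equality. Here, $\alpha$ is treated as a fixed parameter.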

We record some easy facts in the following lemma.

Lemma 6.

1) $\gamma^*_{LP}(\alpha) = \infty$ for $\alpha < B/k$, and is monotonically decreasing as $\alpha$ increases.

2) For $j = 1, 2, \ldots, k$, the slope of the straight line $L'_j(\alpha)$ is

$$-\frac{d - k + (j+1)/2}{r - 1}$$

and the magnitude is strictly less than $d/(r-1)$ when $r \ge 2$.

3) If $j$ is an integral multiple of $r$, then the slope of the straight line $L_j(\alpha)$ is infinite.

4) The line $L_1(\alpha)$ is identical to line $L'_1(\alpha)$. When $r = 1$, the line $L_j(\alpha)$ is identical to line $L'_j(\alpha)$, for $2 \le j \le k$.

5) For $r \ge 2$ and $2 \le j \le k$, the magnitude of the slope of $L_j(\alpha)$ is strictly larger than the magnitude of the slope of $L'_j(\alpha)$. $L_j(\alpha)$ and $L'_j(\alpha)$ intersect at a point lying on the line $\beta_1 = 2\beta_2$ in the $\beta_1$-$\beta_2$ plane.

Proof:

1) If $\alpha < B/k$, the constraint in (25) is violated. Thus $\gamma^*_{LP}(\alpha) = \infty$ for $\alpha < B/k$. As $\alpha$ increases, the feasible region of the linear program in Theorem 5 is enlarged, and thus the minimal value decreases.

2) The line $L'_j(\alpha)$, whose equation is given in (23), has slope of magnitude

$$\frac{j(d-k) + (j^2+j)/2}{jr - j} = \frac{d - k + (j+1)/2}{r - 1}.$$

This fraction is strictly less than $d/(r-1)$ for $r \ge 2$ and $j \le k$.

3) It follows from the fact that $\Delta_{j,r} = jr$ when $j$ is a multiple of $r$.

4) When $j = 1$, we have $\Delta_{j,r} = j = 1$ for all $r$. When $r = 1$, we have $\Delta_{j,r} = j$ for $j = 2, 3, \ldots, k$.

5) For $j = 2, 3, \ldots, k - 1$, the determinant

$$\begin{vmatrix} j(d-k) + \dfrac{j^2 + \Delta_{j,r}}{2} & jr - \Delta_{j,r} \\ j(d-k) + \dfrac{j^2 + j}{2} & jr - j \end{vmatrix}$$

is equal to

$$j(\Delta_{j,r} - j)[d - k + (r + j)/2].$$

Since $\Delta_{j,r} > j$ for $j \ge 2$, and $d \ge k$ by assumption, the determinant is positive, and thus the magnitude of the slope of $L_j(\alpha)$ is strictly larger than the magnitude of the slope of $L'_j(\alpha)$. By subtracting (23) from (24), we obtain $\beta_1 = 2\beta_2$ after some simplifications. ■
Definitions: For $j = 1, 2, 3, \ldots, k$, let $P_j(\alpha)$ be the common intersection point of $L_j(\alpha)$, $L'_j(\alpha)$, and the line $\beta_1 = 2\beta_2$.

Lemma 7. For $j \ge 1$, the coordinates of $P_j(\alpha)$ are

$$\frac{B - (k-j)\alpha}{j(2d - 2k + r + j)} \cdot (2, 1). \qquad (32)$$

Proof: Put $\beta_1 = 2\beta_2$ in (23). ■

For $d = 5$, $k = 4$ and $r = 3$, the points $P_i(1/4)$, for $i = 1, 2, 3, 4$, are shown in Fig. 8. By part 5 of Lemma 6 we can explicitly calculate their coordinates:

$P_1(1/4) = (1/12, 1/24) = (0.0833, 0.0417)$
$P_2(1/4) = (1/14, 1/28) = (0.0714, 0.0357)$
$P_3(1/4) = (1/16, 1/32) = (0.0625, 0.0313)$
$P_4(1/4) = (1/18, 1/36) = (0.0556, 0.0278)$.
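The four coordinates above follow directly from formula (32). As a minimal sketch (the helper name `p_point` is hypothetical, and $B$ defaults to 1 as in the example):

```python
from fractions import Fraction

def p_point(j, d, k, r, alpha, B=1):
    """Coordinates (beta1, beta2) of P_j(alpha) according to (32)."""
    beta2 = (B - (k - j) * Fraction(alpha)) / (j * (2 * d - 2 * k + r + j))
    return (2 * beta2, beta2)     # P_j lies on the line beta1 = 2 * beta2

# Parameters of the running example: d = 5, k = 4, r = 3, alpha = 1/4.
for j in range(1, 5):
    b1, b2 = p_point(j, 5, 4, 3, Fraction(1, 4))
    print(f"P_{j}(1/4) = ({b1}, {b2}) = ({float(b1):.4f}, {float(b2):.4f})")
```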

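If we increase $\alpha$, the points $P_1(\alpha)$ to $P_k(\alpha)$ will "slide down" along the line $\beta_1 = 2\beta_2$ with various speeds. Let $j = 2, 3, \ldots, k$. In the following, we compute the value of $\alpha$ such that $P_j(\alpha)$ and $P_{j-1}(\alpha)$ coincide. We have already shown in Lemma 6 that $\beta_1 = 2\beta_2$. Hence, it suffices to solve

$$B = (k-j)\alpha + \Big(j(d-k) + \frac{j^2+j}{2}\Big)\beta_1 + j(r-1)\frac{\beta_1}{2},$$
$$B = (k-j+1)\alpha + \Big((j-1)(d-k) + \frac{(j-1)^2 + (j-1)}{2}\Big)\beta_1 + (j-1)(r-1)\frac{\beta_1}{2}.$$

By subtracting one of the above equations from the other, we get $\alpha = (2(d-k+j) + r - 1)\beta_1/2$. Substituting this back, we obtain

$$\beta_1 = \frac{2B}{k(2d - 2k + 2j + r - 1) - j(j-1)} = \frac{2B}{D_j},$$
$$\alpha = \big(2(d-k+j) + r - 1\big)\frac{B}{D_j}.$$

The corresponding repair-bandwidth $\gamma$ is

$$\gamma = d\beta_1 + (r-1)\beta_2 = (2d + r - 1)\frac{B}{D_j}.$$

This gives the operating point of the first type in (9). The constant $\alpha_j$ in (2) is defined such that $P_j(\alpha_j) = P_{j-1}(\alpha_j)$.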
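The defining property $P_j(\alpha_j) = P_{j-1}(\alpha_j)$ can be verified with exact arithmetic. The following is a minimal sketch, assuming the expression $D_j = k(2d - 2k + 2j + r - 1) - j(j-1)$ read off from the derivation above; the function names `first_type_point` and `p_point` are hypothetical.

```python
from fractions import Fraction

def first_type_point(j, d, k, r, B=1):
    """Operating point (gamma_j, alpha_j) of the first type."""
    D = k * (2 * d - 2 * k + 2 * j + r - 1) - j * (j - 1)
    gamma = Fraction(B * (2 * d + r - 1), D)
    alpha = Fraction(B * (2 * (d - k + j) + r - 1), D)
    return gamma, alpha

def p_point(j, d, k, r, alpha, B=1):
    """Coordinates (beta1, beta2) of P_j(alpha), as in (32)."""
    beta2 = (B - (k - j) * Fraction(alpha)) / (j * (2 * d - 2 * k + r + j))
    return (2 * beta2, beta2)

d, k, r = 5, 4, 3
for j in range(2, k + 1):
    gamma, alpha = first_type_point(j, d, k, r)
    beta1, beta2 = p_point(j, d, k, r, alpha)
    assert p_point(j - 1, d, k, r, alpha) == (beta1, beta2)   # P_j and P_{j-1} coincide
    assert gamma == d * beta1 + (r - 1) * beta2               # repair-bandwidth at that point
```

For $j = k$ the sketch returns $\gamma_k = \alpha_k$, i.e. the repair-bandwidth equals the storage per node at that point.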

In Fig. 8 we note that the lines $L_1(1/4)$, $L_2(1/4)$, and $L_3(1/4)$ intersect at the same point on the line $\beta_1 = \beta_2$. The operating points of the second type are obtained by generalizing this observation. For notational convenience, we let $L_0(\alpha)$ be the set of points in the $\beta_1$-$\beta_2$ plane satisfying the equation $B = \alpha k$, i.e., it is either the whole plane if $\alpha = B/k$ or the empty set if $\alpha > B/k$.

Lemma 8. Let $\ell$ be an integer between 0 and $\lfloor k/r \rfloor$. We can choose $\alpha$ such that $L_j(\alpha)$, for $j = \ell r, \ell r + 1, \ldots, \ell r + r$, and the line $\beta_1 = \beta_2$ have a common intersection point in the $\beta_1$-$\beta_2$ plane.

Proof: Let $j$ be an integer between $\ell r$ and $\ell r + r$. We can write $j = \ell r + c$ for some integer $c$ in the range $0 \le c \le r$, and get

$$\Delta_{j,r} = \ell r^2 + c^2.$$

The equation of $L_j(\alpha)$, for $\ell r \le j \le \ell r + r$, is

$$B = (k - \ell r - c)\alpha + \Big((\ell r + c)(d-k) + \frac{\ell^2 r^2 + \ell r^2 + 2\ell r c + 2c^2}{2}\Big)\beta_1 + c(r-c)\beta_2.$$

Substituting $\beta_1 = \beta_2$ into the above equation, we can re-write it as

$$B = c\big[(d - k + r(\ell+1))\beta_1 - \alpha\big] + (k - \ell r)\alpha + \big[\ell r(d-k) + r^2\ell(\ell+1)/2\big]\beta_1.$$

When $\alpha = [d + r(\ell+1) - k]\beta_1$, we can eliminate $c$ and get

$$B = \beta_1\big[(k - \ell r)(d - k + r(\ell+1)) + \ell r(d-k) + r^2\ell(\ell+1)/2\big],$$

which can be further simplified to

$$\beta_1 = \frac{B}{k(d + r(\ell+1) - k) - r^2\ell(\ell+1)/2} = \frac{B}{D'_\ell}.$$

Hence, when $\alpha = B(d + r(\ell+1) - k)/D'_\ell$, the point $(B/D'_\ell, B/D'_\ell)$ in the $\beta_1$-$\beta_2$ plane is a common intersection point of $L_j(\alpha)$, for $j = \ell r, \ell r + 1, \ldots, \ell r + r$. ■

Fig. 9. The linear programming problem for $d = 19$, $k = 18$, $r = 3$, $B = 1$, and $\alpha = 7/117 = 0.0598$. The objective function $19\beta_1 + 2\beta_2$ is minimized at the point $Q_1 = (1/117, 1/117) = (0.00854, 0.00854)$.
Definitions: For $\ell = 0, 1, 2, \ldots, \lfloor k/r \rfloor$, define $Q_\ell$ as the point

$$Q_\ell := (B/D'_\ell, B/D'_\ell) \qquad (33)$$

in the $\beta_1$-$\beta_2$ plane.

We note that when $\ell = 0$, we have

$$Q_0 = \big(B/(k(d + r - k)),\ B/(k(d + r - k))\big).$$

An illustration is shown in Fig. 9. The point $Q_1$ is the point marked by a square on the line $\beta_1 = \beta_2$.
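Lemma 8 and the definition of $Q_\ell$ can be checked on the parameters of Fig. 9. The sketch below is illustrative only: the names `second_type_point` and `on_line_L` are hypothetical, and the value $\gamma = (d + r - 1)B/D'_\ell$ is simply the objective $d\beta_1 + (r-1)\beta_2$ evaluated at $Q_\ell$.

```python
from fractions import Fraction

def delta(j, r):
    """Delta_{j,r} = Q r^2 + R^2, where j = Q r + R with 0 <= R < r."""
    q, rem = divmod(j, r)
    return q * r * r + rem * rem

def second_type_point(l, d, k, r, B=1):
    """Return (gamma, alpha, beta), where Q_l = (beta, beta) and beta = B / D'_l."""
    D = Fraction(2 * k * (d + r * (l + 1) - k) - r * r * l * (l + 1), 2)
    beta = B / D
    alpha = (d + r * (l + 1) - k) * beta
    gamma = (d + r - 1) * beta       # objective d*beta1 + (r-1)*beta2 at Q_l
    return gamma, alpha, beta

def on_line_L(j, d, k, r, alpha, beta1, beta2, B=1):
    """True if (beta1, beta2) satisfies the equation of L_j(alpha)."""
    dl = delta(j, r)
    rhs = ((k - j) * alpha
           + Fraction(2 * j * (d - k) + j * j + dl, 2) * beta1
           + (j * r - dl) * beta2)
    return rhs == B

# Parameters of Fig. 9: d = 19, k = 18, r = 3, and l = 1.
gamma, alpha, beta = second_type_point(1, 19, 18, 3)
print(alpha, beta)
# L_3, L_4, L_5 and L_6 all pass through Q_1 = (beta, beta).
assert all(on_line_L(j, 19, 18, 3, alpha, beta, beta) for j in range(3, 7))
```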

Remarks: In the special case of single-loss recovery, i.e., when $r = 1$, the variable $\beta_2$ can take any value without affecting the repair-bandwidth, because the second phase of repair is vacuous. The lines $L_j(\alpha)$ and $L'_j(\alpha)$ representing the linear constraints are vertical lines in the $\beta_1$-$\beta_2$ plane. Naturally, we take $\beta_2 = 0$ when $r = 1$. However, in order to give a unified treatment covering the two cases $r = 1$ and $r \ge 2$, we define $P_j(\alpha)$ and $Q_\ell$ for all $r \ge 1$, even though the $\beta_2$ coordinates of $P_j(\alpha)$ and $Q_\ell$ are nonzero. The results in the next two sections hold for both $r = 1$ and $r \ge 2$.

V. Construction of Maximal Flow 

To ease the presentation, we modify the information flow graph by adding more "out" vertices, so that in each stage, each storage node is associated with a unique "out" vertex. If the storage node is not repaired in stage $s$, we draw a directed edge with infinite capacity from the "out" node in stage $s - 1$ to it. With the addition of these new vertices, all inter-stage edges are between two consecutive stages. A modified information flow graph is denoted by $G^{\mathrm{m}}(n, d, k, r; \alpha, \beta_1, \beta_2)$. The modified information flow graph $G^{\mathrm{m}}(6, 4, 3, 2; 7, 2, 1)$ derived from the example in Fig. 5 is shown in Fig. 10.

Fig. 10. An example of modified information flow graph ($n = 6$, $d = 4$, $k = 3$, $r = 2$, $\alpha = 7$, $\beta_1 = 2$, $\beta_2 = 1$).

We consider the cuts of the modified information flow graph which separate two consecutive stages. The flow patterns through such "vertical" cuts are captured by the vectors defined in the following definition.

Definitions: Consider a data collector DC in stage $s$, $s \ge 0$. Given a particular flow $F$ on $G$, for $t = 0, 1, 2, \ldots, s - 1$, let $\mathbf{h}^{(t)} = [h_1^{(t)}\ h_2^{(t)}\ \cdots\ h_n^{(t)}]$ be the $n$-dimensional vector whose $i$-th component is the sum of flow through the vertex $\mathrm{Out}_i$ in stage $t$. We call the vector $\mathbf{h}^{(t)}$ a flow pattern.

We will use the superscript $(t)$ to signify that a vector or variable is pertaining to stage $t$.

For example in Fig. 10 the in-flows of the six vertices in stage 0 are respectively 0, 0, 2, 4, 6, and 7. Hence

$$\mathbf{h}^{(0)} = [0\ 0\ 2\ 4\ 6\ 7]. \qquad (34)$$

The first two components are zero because nodes 1 and 2 fail in stage 0, and they do not have any out-flow. In stage 2, nodes 3 and 4 are repaired. We get

$$\mathbf{h}^{(1)} = [7\ 7\ 0\ 0\ 2\ 3]. \qquad (35)$$

We note that if $i \in \mathcal{R}_t$, then node $i$ fails in stage $t - 1$, and the $i$-th component in $\mathbf{h}^{(t-1)}$ is equal to zero. Finally, the three edges which terminate at the DC have flow 7, 7 and 5 respectively, and we get

$$\mathbf{h}^{(2)} = [7\ 7\ 5\ 0\ 0\ 0]. \qquad (36)$$

Definitions: A vector $\mathbf{v} \in \mathbb{R}^n$ is called transmissive at stage $s$ ($s \ge 0$), if in any modified information flow graph $G^{\mathrm{m}}(n, d, k, r; \alpha, \beta_1, \beta_2)$, we can assign flow $\phi(e)$ to the edges $e$ in or before stage $s$, such that (i) $\phi(e)$ does not exceed the capacity of edge $e$, (ii) the sum of in-flow is equal to the sum of out-flow for all vertices in stage 1 to $s - 1$, and (iii) the in-flow of the $i$-th "out" vertex in stage $s$ is equal to the $i$-th component in $\mathbf{v}$. A vector $\mathbf{v} \in \mathbb{R}^n$ which is transmissive at all stages is called transmissive.

As a simple non-example, we note that any vector with a component strictly larger than $\alpha$ is not transmissive in any stage.

Some comments on transmissive vectors and flow patterns are in order. (i) A flow pattern is always attached to a data collector, but the definition of a transmissive vector does not involve any data collector. (ii) A vector which is transmissive in one stage may not be transmissive in another stage. For example, in stage 0, the vector $[\alpha\ \alpha\ \ldots\ \alpha]$ with all components equal to $\alpha$ is transmissive, but it is not a valid flow pattern (unless we are in the trivial case that $n = k$), and is not transmissive in stage 1.

For the operating points of the first type, we have the 
following theorem. 

Theorem 9. Let $z$ be an integer between 0 and $k - 2$, and let the parameters of the distributed storage system be

$$\alpha = 2(d - z) + r - 1, \quad \beta_1 = 2, \quad \beta_2 = 1.$$

Then the max-flow to each DC is at least

$$B = k(2d + r - k) - z - z^2.$$

In fact, any vector $\mathbf{h} \in \mathbb{R}^n$ which is majorized by

$$[\underbrace{\alpha\ \alpha\ \cdots\ \alpha}_{z \text{ times}}\ \ \alpha\ \ \alpha - 2\ \ \alpha - 4\ \ \cdots\ \ \alpha - 2(k - z - 1)\ \ \underbrace{0\ \cdots\ 0}_{n-k \text{ times}}] \qquad (37)$$

is transmissive. Furthermore, if the components of $\mathbf{h}$ are non-negative integers, then we can choose an integral flow in the modified information flow graph.

By Theorem 9 with $z = 1$, the vectors $\mathbf{h}^{(0)}$ in (34), $\mathbf{h}^{(1)}$ in (35), and $\mathbf{h}^{(2)}$ in (36) are all transmissive in the modified information flow graph in Fig. 10. Indeed, they are all majorized by $[7\ 7\ 5\ 0\ 0\ 0]$.
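The majorization claims for the three flow patterns of Fig. 10 can be verified mechanically. The sketch below assumes the standard definition of majorization (equal totals, and every sum of the $m$ largest components of $\mathbf{h}$ bounded by the corresponding sum for the dominating vector); the paper's own definition appears in an earlier section and is not reproduced here. The names `is_majorized` and `theorem9_vector` are hypothetical.

```python
def is_majorized(h, g):
    """True if h is majorized by g (standard definition, assumed here)."""
    hs, gs = sorted(h, reverse=True), sorted(g, reverse=True)
    if len(hs) != len(gs) or sum(hs) != sum(gs):
        return False
    partial_h = partial_g = 0
    for a, b in zip(hs, gs):
        partial_h += a
        partial_g += b
        if partial_h > partial_g:
            return False
    return True

def theorem9_vector(n, d, k, r, z):
    """The dominating vector (37) for the parameters of Theorem 9."""
    alpha = 2 * (d - z) + r - 1
    return [alpha] * z + [alpha - 2 * i for i in range(k - z)] + [0] * (n - k)

# Parameters of Fig. 10: n = 6, d = 4, k = 3, r = 2, with z = 1 (so alpha = 7).
top = theorem9_vector(6, 4, 3, 2, 1)
patterns = [[0, 0, 2, 4, 6, 7], [7, 7, 0, 0, 2, 3], [7, 7, 5, 0, 0, 0]]
print(top, [is_majorized(h, top) for h in patterns])
```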

We check that the sum of the components in (37) is equal to $B$:

$$k\alpha - \sum_{i=0}^{k-z-1} 2i = k(2d - 2z + r - 1) - (k - z - 1)(k - z) = k(2d + r - k) - z - z^2 = B.$$

Let $\mathcal{P}$ be the set of vectors in $\mathbb{R}_+^n$ which are majorized by the vector in (37). By Lemma 2, the set of vectors $\mathcal{P}$ is a base-polymatroid

$$\Big\{\mathbf{x} \in \mathbb{R}_+^n : x(S) \le f(S) \text{ for } S \subseteq \{1, 2, \ldots, n\} \text{ and } \sum_{i=1}^{n} x_i = B\Big\}$$

associated with the rank function $f : 2^{\{1, 2, \ldots, n\}} \to \mathbb{Z}_+$ given by $f(S) = \theta_{|S|}$, where $\theta_j$ is defined by

$$\theta_j := \min(k, j) \cdot \alpha - \sum_{i=0}^{\min(k,j) - z - 1} 2i \qquad (38)$$

for $j = 0, 1, 2, \ldots, n$. (If the upper limit of a summation is negative, the summation is equal to 0 by convention.) We check that $\theta_x$ is equal to the sum of the first $x$ terms in (37), and $\theta_k = \theta_{k+1} = \cdots = \theta_n = B$.

Fig. 11. An example of the auxiliary graph.
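The two properties of $\theta_j$ stated above (it equals the prefix sums of (37), and it saturates at $B$ from $j = k$ onwards) are easy to confirm numerically. A minimal sketch, with the hypothetical function name `theta` and the parameters of Fig. 10:

```python
from itertools import accumulate

def theta(j, k, z, alpha):
    """theta_j from (38); an empty summation (negative upper limit) counts as 0."""
    m = min(k, j)
    return m * alpha - sum(2 * i for i in range(m - z))   # i = 0, ..., m - z - 1

# n = 6, d = 4, k = 3, r = 2, z = 1, hence alpha = 2(d - z) + r - 1 = 7.
n, d, k, r, z = 6, 4, 3, 2, 1
alpha = 2 * (d - z) + r - 1
B = k * (2 * d + r - k) - z - z * z
vector_37 = [alpha] * z + [alpha - 2 * i for i in range(k - z)] + [0] * (n - k)

thetas = [theta(j, k, z, alpha) for j in range(n + 1)]
assert thetas[1:] == list(accumulate(vector_37))   # prefix sums of (37)
assert all(t == B for t in thetas[k:])             # theta_k = ... = theta_n = B
```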

The proof of Theorem 9 relies on the trellis structure of the modified information flow graph. The subgraph obtained by restricting to one stage is isomorphic to the subgraph obtained by restricting to another stage. This allows us to simplify the analysis to only one stage. Consider the subgraph of the modified information flow graph consisting of the vertices in stage $s$ and the $n$ "out" vertices in stage $s - 1$. We call this the auxiliary graph, and let $V'$ be the vertex set of this auxiliary graph. By re-labeling the storage nodes, we assume without loss of generality that nodes 1 to $r$ are regenerated in stage $s$. The first $r$ "out" vertices in stage $s - 1$ are disconnected from the rest of the auxiliary graph. In order to distinguish the "out" vertices in stage $s - 1$ and $s$, we re-label the $n - r$ "out" vertices in stage $s - 1$ by $v_{r+1}, v_{r+2}, \ldots, v_n$. An example for $n = 6$ and $d = r = 3$ is given in Fig. 11.

The construction of flow in Theorem 9 is recursive. We consider the vertices on the left-hand side of the auxiliary graph as input vertices and the vertices on the right as output vertices. Let $\mathbf{h}$ be a vector majorized by the vector in (37). The vector $\mathbf{h}$ is regarded as the demand from the "out" nodes in the auxiliary graph. We look for a valid flow assignment in the auxiliary graph such that the flow to each "out" vertex is equal to the corresponding component in $\mathbf{h}$, and meanwhile the input flow assignment is majorized by (37).

We now specify a submodular function $\sigma : 2^{V'} \to \mathbb{R}$. Let $O_{s-1}$ be the set of "out" vertices $\{v_{r+1}, v_{r+2}, \ldots, v_n\}$ in stage $s - 1$, and $O_s$ be the set of "out" vertices $\{\mathrm{Out}_1, \mathrm{Out}_2, \ldots, \mathrm{Out}_n\}$ in stage $s$. Given a subset $S$ of vertices in the auxiliary graph, define

$$\sigma(S) := f(S \cap O_{s-1}) - h(S \cap O_s).$$

The notation $h(S \cap O_s)$ in the above definition means

$$\sum_{i \,:\, \mathrm{Out}_i \in S} h_i.$$

The function $\sigma(S)$ is submodular because it is the sum of a submodular function $f(S \cap O_{s-1})$ and a modular function $-h(S \cap O_s)$. Also, we note that

$$\sigma(V') = f(O_{s-1}) - h(O_s) = B - B = 0.$$

We define upper bounds and lower bounds on the edges in the auxiliary graph as follows. For $i = r + 1, r + 2, \ldots, n$, the edge joining $v_i$ and $\mathrm{Out}_i$ has lower bound and upper bound equal to $h_i$. An edge terminating at an "in" vertex has lower bound 0 and upper bound $\beta_1$. An edge from $\mathrm{In}_i$ to $\mathrm{Mid}_j$, for $i \ne j$, has lower bound 0 and upper bound $\beta_2$, while an edge from $\mathrm{In}_i$ to $\mathrm{Mid}_j$, for $i = j$, has lower bound 0 and upper bound $\infty$. An edge from a "mid" vertex to an "out" vertex has lower bound 0 and upper bound $\alpha$. We summarize the lower and upper bounds on the edges in the auxiliary graph as follows.

    Edge e                     lb(e)    ub(e)
    (v_i, Out_i)               h_i      h_i
    (v_i, In_j)                0        β1
    (In_i, Mid_j), i ≠ j       0        β2
    (In_i, Mid_j), i = j       0        ∞
    (Mid_i, Out_i)             0        α

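To apply Theorem 3 we need to verify that condition (21) holds for all subsets $S \subseteq V'$.

Lemma 10. With the lower bounds, upper bounds and submodular function $\sigma$ specified above, we have

$$\mathrm{lb}(\Delta^{\mathrm{out}} S) - \mathrm{ub}(\Delta^{\mathrm{in}} S) \le \sigma(S), \qquad (39)$$

for all $S \subseteq V'$.

The proof of Lemma 10 is given in Appendix B.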

Proof of Theorem 9: We proceed by induction on stages. Let $\mathbf{h}$ be a vector in $\mathcal{P}$. Since each component of $\mathbf{h}$ is less than or equal to $\alpha$, we can always assign flow on the edges from the source vertex to the vertices in stage 0 such that $\mathbf{h}^{(0)} = \mathbf{h}$, without violating any capacity constraint. Hence $\mathbf{h}$ is transmissive at stage 0.

Suppose that all vectors in $\mathcal{P}$ are transmissive in stage $s - 1$. Consider the auxiliary graph consisting of the vertices in stage $s$ and the $n$ "out" vertices in stage $s - 1$. By applying Frank's theorem (Theorem 3), there exists a feasible submodular flow, say $\phi$, on the auxiliary graph. The flow conservation constraint is satisfied for the "in" and "mid" vertices in the auxiliary graph, because by definition $\sigma(\{\mathrm{In}_i\}) = \sigma(\{\mathrm{Mid}_i\}) = 0$. Since the submodular flow guaranteed by Frank's theorem is feasible, the capacity constraints in the modified information flow graph are satisfied.

If we take any subset $A$ of the "out" vertices in stage $s - 1$, from the definition of a submodular flow, we obtain

$$\partial\phi(A) = \phi(\Delta^{\mathrm{out}} A) \le \sigma(A) = f(A).$$

The "input" in the $(s-1)$-st stage is thus transmissive at stage $s - 1$. By the induction hypothesis, we can assign real values to the edges from stage $-1$ to $s - 1$ in the modified information flow graph, such that the flow conservation constraint is satisfied, and the in-flow of the "out" vertices $\mathrm{Out}_i$ in stage $s - 1$ is precisely the inputs of the corresponding vertices in the auxiliary graph.

Finally, we want to show that

$$\phi(\Delta^{\mathrm{in}} \{\mathrm{Out}_i\}) = h_i$$

for $i = 1, 2, \ldots, n$. By the defining property of a submodular flow again, we have

$$\partial\phi(\{\mathrm{Out}_i\}) = -\phi(\Delta^{\mathrm{in}} \{\mathrm{Out}_i\}) \le -h_i.$$

If we take $S$ to be the subset

$$S = \{\mathrm{In}_1, \mathrm{In}_2, \ldots, \mathrm{In}_r, \mathrm{Mid}_1, \mathrm{Mid}_2, \ldots, \mathrm{Mid}_r\},$$

we have

$$0 = \sigma(S) \ge \partial\phi(S) = \phi(\Delta^{\mathrm{out}} S) - \phi(\Delta^{\mathrm{in}} S) \ge \sum_{i=1}^{n} h_i - \sigma(\{v_{r+1}, v_{r+2}, \ldots, v_n\}) = B - B = 0.$$

Therefore, all inequalities above are in fact equalities. Thus $\phi(\Delta^{\mathrm{in}} \{\mathrm{Out}_i\}) = h_i$ for all $i$. This gives a flow on the $s$-th stage of the modified information flow graph yielding the desired vector $\mathbf{h}$. This proves that $\mathbf{h}$ is transmissive at stage $s$.

If the components of $\mathbf{h}$ are non-negative integers, then the result follows from the second statement of Frank's theorem. This completes the proof of Theorem 9. ■

Theorem 11. For $j = 2, 3, \ldots, k$, the operating point $(\gamma_j, \alpha_j)$ (defined earlier) is in $\mathcal{C}_{MF}(d, k, r)$.

Proof: Consider a data collector DC which connects to $k$ storage nodes in stage $s$. We want to construct a flow from the source node to DC such that the flows on the $k$ links from the $k$ storage nodes to the data collector are precisely the non-zero components in (37). The required flow pattern is certainly majorized by (37). By Theorem 9, we can always find a flow meeting the requirement of this data collector, regardless of which storage nodes failed in earlier stages. Hence, for $z = 0, 1, \ldots, k - 2$, the operating point with $\alpha = 2(d - z) + r - 1$, repair-bandwidth $\gamma = d\beta_1 + (r-1)\beta_2 = 2d + r - 1$ and file size $B = k(2d + r - k) - z - z^2$ is in $\mathcal{C}_{MF}(d, k, r)$. After a change of the indexing variable by $k = z + j$, we see that, for $j = 2, 3, \ldots, k$,

$$(\gamma_j, \alpha_j) = \frac{B}{D_j}\big(2d + r - 1,\ 2(d - k + j) + r - 1\big)$$

is in $\mathcal{C}_{MF}(d, k, r)$. ■

Analogous to Theorem 9 and Theorem 11, we have the following two theorems.

Theorem 12. Let $\ell$ be an integer between 0 and $\lfloor k/r \rfloor$, and let the parameters of a distributed storage system be

$$\alpha = d + r(\ell + 1) - k, \quad \beta_1 = \beta_2 = 1.$$

The max-flow to each DC is at least

$$B = k(d + r(\ell + 1) - k) - r^2\ell(\ell + 1)/2.$$

In fact, any vector $\mathbf{h} \in \mathbb{R}^n$ majorized by

$$[\underbrace{\alpha\ \cdots\ \alpha}_{k - \ell r \text{ times}}\ \underbrace{\alpha - r\ \cdots\ \alpha - r}_{r \text{ times}}\ \underbrace{\alpha - 2r\ \cdots\ \alpha - 2r}_{r \text{ times}}\ \cdots\ \underbrace{\alpha - \ell r\ \cdots\ \alpha - \ell r}_{r \text{ times}}\ \underbrace{0\ \cdots\ 0}_{n - k \text{ times}}] \qquad (40)$$

is transmissive. Furthermore, if the components of $\mathbf{h}$ are non-negative integers, then we can choose an integral flow in the modified information flow graph.

Theorem 13. For $\ell = 0, 1, 2, \ldots, \lfloor k/r \rfloor$, the operating point $(\gamma'_\ell, \alpha'_\ell)$ (defined in (5) and (6)) is in $\mathcal{C}_{MF}(d, k, r)$.

The proof of Theorem 12 is given in Appendix C.

In summary, we have shown in Theorems 11 and 13 that

$$\gamma^*_{MF}(\alpha_j) \le \gamma_j,$$

for $j = 2, 3, \ldots, k$, and

$$\gamma^*_{MF}(\alpha'_\ell) \le \gamma'_\ell$$

for $\ell = 0, 1, \ldots, \lfloor k/r \rfloor$. Next, we show that under the conditions of Theorem 14 below, these are indeed the optimal values of $\gamma^*_{MF}(\alpha_j)$ or $\gamma^*_{MF}(\alpha'_\ell)$.

Theorem 14. If $d \le (r - 1)\mu(j)$ for any $j = 2, 3, \ldots, k$, then

$$\gamma^*_{LP}(\alpha_j) = \gamma^*_{MF}(\alpha_j) = \gamma_j.$$

If $d > (r - 1)\mu(j)$ for any $j = 0, 1, 2, \ldots, k$, then

$$\gamma^*_{LP}(\alpha'_{\lfloor j/r \rfloor}) = \gamma^*_{MF}(\alpha'_{\lfloor j/r \rfloor}) = \gamma'_{\lfloor j/r \rfloor}.$$

Proof: The proof is based on the following fact in linear programming with two variables. Suppose the slope of the objective function is $m$, and $\mathbf{x}$ is a feasible solution. If $\mathbf{x}$ satisfies two linear constraints with equality, one with slope larger than or equal to $m$ and one with slope smaller than or equal to $m$, then $\mathbf{x}$ is the optimal solution. Recall that the objective function of the linear program considered in this paper has slope $-d/(r - 1)$, which is infinite when $r = 1$.

For the first statement in the theorem, the point $P_j(\alpha_j)$ in the $\beta_1$-$\beta_2$ plane is a feasible solution by Theorem 11 and meets the two linear constraints corresponding to $L_j(\alpha_j)$ and $L'_j(\alpha_j)$ with equality. The optimality of $P_j(\alpha_j)$ follows from (i) line $L_j(\alpha_j)$ has slope of magnitude $\mu(j)$, which is larger than or equal to the magnitude of the slope of the objective function by hypothesis, and (ii) line $L'_j(\alpha_j)$ has slope of magnitude less than $d/(r - 1)$ (see Lemma 6, part 2). Therefore $P_j(\alpha_j)$ is the optimal solution. The corresponding $\alpha$ and $\gamma$ are $\alpha_j$ and $\gamma_j$ respectively. Therefore $\gamma^*_{MF}(\alpha_j) = \gamma_j$.

For the second statement in the theorem, the point $Q_\ell$ with $\ell = \lfloor j/r \rfloor$ in the $\beta_1$-$\beta_2$ plane is a feasible solution by Theorem 13. The point $Q_\ell$ meets two linear constraints with equality. The first one is the inequality associated with line $L_{r\lfloor j/r \rfloor}(\alpha'_{\lfloor j/r \rfloor})$, which is a vertical line. The second one is the inequality associated with line $L_j(\alpha'_{\lfloor j/r \rfloor})$, whose slope has magnitude strictly less than $d/(r - 1)$ by hypothesis. Therefore $Q_\ell$ is the optimal solution to the linear program. The corresponding $\alpha$ and $\gamma$ are $\alpha'_{\lfloor j/r \rfloor}$ and $\gamma'_{\lfloor j/r \rfloor}$ respectively. Therefore $\gamma^*_{MF}(\alpha'_{\lfloor j/r \rfloor}) = \gamma'_{\lfloor j/r \rfloor}$. ■

We check that the MSCR and MBCR points are boundary points in $\mathcal{C}_{MF}$. Firstly, the slope of the line $L_1(\alpha)$ has magnitude

$$\frac{d - k + 1}{r - 1},$$

which is strictly less than $d/(r - 1)$. We get

$$\gamma^*_{LP}(\alpha_{\mathrm{MSCR}}) = \gamma^*_{MF}(\alpha_{\mathrm{MSCR}}) = \gamma'_0 = \gamma_{\mathrm{MSCR}}.$$

Secondly, the condition $d \le (r - 1)\mu(k)$ reduces to an inequality which holds for all positive integers $d$, $k$ and $r$. Hence

$$\gamma^*_{LP}(\alpha_{\mathrm{MBCR}}) = \gamma^*_{MF}(\alpha_{\mathrm{MBCR}}) = \gamma_k = \gamma_{\mathrm{MBCR}}.$$

Consequently, we have shown that the operating points in Theorem 1 are boundary points in $\mathcal{C}_{LP}$ and $\mathcal{C}_{MF}$. The next theorem shows that there is no gap between $\mathcal{C}_{LP}$ and $\mathcal{C}_{MF}$.

Theorem 15. $\mathcal{C}_{LP} = \mathcal{C}_{MF}$.

The proof of Theorem 15 is technical and is given in Appendix D.

VI. Linear Network Codes for Cooperative Repair 

In this section, we show that the Pareto-optimal operating points in $\mathcal{C}_{MF}$ can be achieved by linear network coding, with an explicit bound on the required finite field size.

Let $\mathbb{F}_q$ denote the finite field of size $q$, where $q$ is a power of a prime. The size of $\mathbb{F}_q$ will be determined later in this section. We normalize the unit such that an element in $\mathbb{F}_q$ contains one unit of data. The whole data file is divided into a number of chunks, and each chunk contains $B$ finite field elements. As each chunk of data will be encoded and treated in the same way, it suffices to describe the operations on one chunk of data. A packet is identified with an element in $\mathbb{F}_q$, and we will use "an element in $\mathbb{F}_q$", "a packet" and "a symbol" synonymously.

We scale the values of $B$, $\beta_1$, $\beta_2$, and $\alpha$, so that they are all integers. A chunk of data is represented by a $B$-dimensional column vector $\mathbf{m} \in \mathbb{F}_q^B$. The data packet stored in a storage node is a linear combination of the components in $\mathbf{m}$, with coefficients taken from $\mathbb{F}_q$. The coefficients associated with a packet form a vector, called the global encoding vector. For $i = 1, 2, \ldots, n$, and $t \ge 0$, the packets stored in node $i$ are denoted by the vector $\mathbf{M}_i^{(t)}\mathbf{m}$, where $\mathbf{M}_i^{(t)}$ is an $\alpha \times B$ matrix. The rows of $\mathbf{M}_i^{(t)}$ are the global encoding vectors of the packets in node $i$ in stage $t$. We will assume that the global encoding vectors are stored together with the packets in the storage nodes. The overhead on storage incurred by the global encoding vectors can be made vanishingly small if the number of chunks is large. The $(n, k)$ recovery property is translated to the requirement that the totality of the global encoding vectors in any $k$ storage nodes span the vector space $\mathbb{F}_q^B$.

The code construction method we are going to describe maintains the following property, which we will call the regularity property. In the following, $\mathcal{P}$ is a set of transmissive vectors with integral components, such that the sum of the components of each vector in $\mathcal{P}$ is equal to $B$.

Regularity Property with respect to $\mathcal{P}$: For $t \ge 0$ and for each vector $\mathbf{h} = [h_1\ h_2\ \ldots\ h_n]$ in $\mathcal{P}$, if we take the first $h_i$ rows of $\mathbf{M}_i^{(t)}$ for each $i$, and put them together as a $B \times B$ matrix, then the resulting determinant, denoted by $D_{\mathbf{h}}^{(t)}$, is non-zero.

The realization of cooperative repair using linear network 
coding is described as follows. 

Stage 0: For $i = 1,2,\ldots,n$, node $i$ is initialized by storing
the $\alpha$ components in $M_i^{(0)}\mathbf{m}$.

Stage t: For notational convenience, we suppose without 
loss of generality that node 1 to node r fail at stage t, and we 
want to regenerate them in stage t + 1. 

• Phase 1. For $j = 1,2,\ldots,r$ and $i \in \mathcal{H}_{t,j}$, the $\beta_1$ packets
sent from node $i$ to node $j$ are linear combinations of the
packets stored in node $i$ at stage $t$. For $\ell = 1,2,\ldots,\beta_1$,
let the $\ell$-th packet sent from node $i$ to node $j$ be
$\mathbf{p}_{i,j,\ell}^{(t)} M_i^{(t)}\mathbf{m}$, where $\mathbf{p}_{i,j,\ell}^{(t)}$ is a $1 \times \alpha$ row vector over $\mathbb{F}_q$.

• Phase 2. Stack the $d\beta_1$ packets received by node $j$ into
a column vector called $\mathbf{u}_j^{(t)}$. For $j_1, j_2 \in \{1,2,\ldots,r\}$
and $j_1 \ne j_2$, node $j_1$ sends $\beta_2$ packets to node $j_2$. For
$\ell = 1,2,\ldots,\beta_2$, the $\ell$-th packet sent from node $j_1$ to node
$j_2$ is $\mathbf{q}_{j_1,j_2,\ell}^{(t)}\mathbf{u}_{j_1}^{(t)}$, where $\mathbf{q}_{j_1,j_2,\ell}^{(t)}$ is a $(d\beta_1)$-dimensional
row vector over $\mathbb{F}_q$.

The $(r-1)\beta_2$ packets received by newcomer $j$ during
phase 2 are put together to form an $((r-1)\beta_2)$-dimensional
column vector $\mathbf{v}_j^{(t)}$. For $\ell = 1,2,\ldots,\alpha$, newcomer $j$ multiplies
a $(d\beta_1 + (r-1)\beta_2)$-dimensional row vector $\mathbf{r}_{j,\ell}^{(t)}$ with the
column vector obtained by concatenating $\mathbf{u}_j^{(t)}$ and $\mathbf{v}_j^{(t)}$, and
stores the product as the $\ell$-th packet in the memory.

The vectors $\mathbf{p}_{i,j,\ell}^{(t)}$, $\mathbf{q}_{j_1,j_2,\ell}^{(t)}$ and $\mathbf{r}_{j,\ell}^{(t)}$ are called the local
encoding vectors. The components in the local encoding
vectors are variables assuming values in $\mathbb{F}_q$. The total number
of "degrees of freedom" in choosing the local encoding vectors
is

\[ N = rd\beta_1\alpha + r(r-1)\beta_2(d\beta_1) + r\alpha\bigl(d\beta_1 + (r-1)\beta_2\bigr). \]

We will call these $N$ variables the local encoding kernels at
stage $t$. We will show that, given the global encoding vectors
of the $n$ nodes in stage $t-1$, we can find local encoding kernels
in stage $t$ such that the $(n,k)$ recovery property is maintained,
provided that the finite field size is sufficiently large.
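
As a quick illustration of the count above, the following minimal Python sketch evaluates $N$ term by term. The function name and the sample parameters are ours, chosen only for illustration.

```python
def num_local_encoding_kernels(r, d, alpha, beta1, beta2):
    """Count the free variables in the local encoding vectors of one repair stage."""
    # Phase 1: r*d*beta1 row vectors p, each of length alpha.
    phase1 = r * d * beta1 * alpha
    # Phase 2: r*(r-1)*beta2 row vectors q, each of length d*beta1.
    phase2 = r * (r - 1) * beta2 * (d * beta1)
    # Storing: r*alpha row vectors r, each of length d*beta1 + (r-1)*beta2.
    storing = r * alpha * (d * beta1 + (r - 1) * beta2)
    return phase1 + phase2 + storing

# Illustrative parameters: d = 4, r = 3, beta1 = 2, beta2 = 1, alpha = 2d + r - 1 = 10.
print(num_local_encoding_kernels(r=3, d=4, alpha=10, beta1=2, beta2=1))
```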

The proof depends on the trellis structure of the modified
information flow graph defined in the last section. The "transfer
function" can be factorized as a product of matrices.

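We concatenate all packets in the $n$ storage nodes in stage $t$
into an $(n\alpha)$-dimensional vector, and write

\[ \mathbf{s}^{(t)} = M^{(t)}\mathbf{m}, \]

where $M^{(t)}$ is the $(\alpha n) \times B$ matrix

\[ M^{(t)} = \begin{bmatrix} M_1^{(t)} \\ \vdots \\ M_n^{(t)} \end{bmatrix}. \]

The packets in stage $t$ can be obtained by multiplying $\mathbf{s}^{(t-1)}$
by an $(n\alpha) \times (n\alpha)$ matrix $T^{(t)}$,

\[ \mathbf{s}^{(t)} = T^{(t)}\mathbf{s}^{(t-1)}. \tag{41} \]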

If nodes 1 to $r$ fail, the matrix $T^{(t)}$ can be partitioned into

\[ T^{(t)} = \begin{bmatrix} \mathbf{0} & A \\ \mathbf{0} & I \end{bmatrix}, \]

where $I$ is the identity matrix of size $(n-r)\alpha \times (n-r)\alpha$,
and $A$ is an $r\alpha \times (n-r)\alpha$ matrix. The entries of $A$ are
polynomials with the $N$ local encoding kernels at stage $t$ as
the variables. Moreover, the entries in $A$ have degree 1 in
each variable. We can see this by fixing all but one variable
in the local encoding kernels; the packets stored in the
$r$ newcomers' memory are then affine functions of the remaining
variable.

Applying (41) recursively, we obtain

\[ M^{(t)} = T^{(t)}T^{(t-1)}\cdots T^{(0)} \]

for $t \ge 0$. The matrix $T^{(0)}$ is an $(n\alpha) \times B$ rectangular matrix.
Each entry in $T^{(0)}$ is a variable taking values in $\mathbb{F}_q$. Using this
representation, we see that the rows in $M_i^{(t)}$ are rows $(i-1)\alpha+1$,
$(i-1)\alpha+2, \ldots, i\alpha$ in the product $T^{(t)}T^{(t-1)}\cdots T^{(0)}$.

We will need the following tool due to N. Alon. By a non-zero
multi-variable polynomial, we mean a polynomial that, when
expressed as a summation of monomials, has at least one
term with non-zero coefficient.

Lemma 16 (Combinatorial Nullstellensatz [32]). Let
$f(x_1,x_2,\ldots,x_N)$ be a non-zero multi-variable polynomial
over $\mathbb{F}_q$ of total degree $D$, which contains a non-zero
coefficient at $x_1^{D_1}\cdots x_N^{D_N}$ with $D_1 + D_2 + \cdots + D_N = D$.
Let $S_1, S_2, \ldots, S_N$ be subsets of $\mathbb{F}_q$ such that $|S_i| > D_i$ for
all $i$. Then there exist $a_1 \in S_1, \ldots, a_N \in S_N$ such that
$f(a_1,\ldots,a_N) \ne 0$.

We say that the local degree of a polynomial $f$ is less than
or equal to $\ell$ if the degree of $f$ in each variable is less than or
equal to $\ell$. The Combinatorial Nullstellensatz directly implies
that if $f$ is a non-zero polynomial in $\mathbb{F}_q[x_1,x_2,\ldots,x_N]$ with
local degree less than or equal to $\ell$, and $q > \ell$, then there is a
point $(a_1,a_2,\ldots,a_N) \in \mathbb{F}_q^N$ such that $f(a_1,a_2,\ldots,a_N) \ne 0$.
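
The role of the field size can be seen on a toy polynomial. The sketch below is ours and only illustrative: it searches by brute force for a point where $f(x,y) = (x^2+x)(y^2+y)$, a non-zero polynomial of local degree 2, does not vanish. The helper name `has_nonroot` is an assumption, and only prime $q$ is handled, so that arithmetic modulo $q$ is field arithmetic.

```python
from itertools import product

def has_nonroot(f, q, num_vars):
    """Brute-force search over F_q^num_vars (q prime) for a point where f is non-zero."""
    return any(f(*point) % q != 0 for point in product(range(q), repeat=num_vars))

# f(x, y) = (x^2 + x)(y^2 + y): non-zero as a polynomial, local degree 2.
f = lambda x, y: (x * x + x) * (y * y + y)

print(has_nonroot(f, 2, 2))  # q = 2 does not exceed the local degree
print(has_nonroot(f, 3, 2))  # q = 3 exceeds the local degree
```

Over $\mathbb{F}_2$ the polynomial vanishes at every point, whereas over $\mathbb{F}_3$ the point $(1,1)$ is a non-root, in line with the requirement that $q$ exceed the local degree.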

We treat the two different types of Pareto-optimal operating
points, one described in Theorem 9 and one in Theorem 12,
separately.

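Pareto-optimal operating point of the first type: We
let the system parameters be the same as in Theorem 11.
Let $j$ be an integer between 2 and $k$. Consider a linear cooperative
regenerating code with the following parameters:

\[ B = k(2d + r - k) - (k-j)(k-j+1), \]
\[ \beta_1 = 2, \quad \beta_2 = 1, \]
\[ \alpha = 2(d - k + j) + r - 1, \quad\text{and} \]
\[ \gamma = 2d + r - 1. \]

Let $\mathcal{P}_j$ be the subset of vectors in $\mathbb{Z}^n$ which are majorized by

\[ [\,\underbrace{\alpha\ \ \alpha\ \cdots\ \alpha}_{k-j\ \text{times}}\ \ \underbrace{\alpha\ \ \alpha-2\ \ \alpha-4\ \cdots\ \alpha-2(j-1)}_{j\ \text{terms}}\ \ \underbrace{0\ \cdots\ 0}_{n-k\ \text{times}}\,]. \]

Let $|\mathcal{P}_j|$ be the cardinality of $\mathcal{P}_j$.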
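
The components of the majorizing vector sum to $B$, as required of transmissive vectors in the regularity property. A minimal Python sketch of this sanity check follows; the function names and the sample values $d = 4$, $k = 3$, $r = 2$, $n = 6$ are ours.

```python
def first_type_point(d, k, r, j):
    """Parameters (B, alpha, gamma) of the first-type operating point indexed by j."""
    B = k * (2 * d + r - k) - (k - j) * (k - j + 1)
    alpha = 2 * (d - k + j) + r - 1
    gamma = 2 * d + r - 1
    return B, alpha, gamma

def first_type_majorizer(n, k, alpha, j):
    """k-j copies of alpha, then alpha, alpha-2, ..., alpha-2(j-1), then n-k zeros."""
    return [alpha] * (k - j) + [alpha - 2 * m for m in range(j)] + [0] * (n - k)

d, k, r, n = 4, 3, 2, 6
for j in range(2, k + 1):
    B, alpha, gamma = first_type_point(d, k, r, j)
    # The components of the majorizing vector add up to the file size B.
    assert sum(first_type_majorizer(n, k, alpha, j)) == B
    print(j, B, alpha, gamma)
```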

To initialize the system, we choose the entries in $T^{(0)}$ such
that the regularity property with respect to $\mathcal{P}_j$ holds at stage 0,
i.e., the determinant $D_{\mathbf{h}}^{(0)}$ defined in the regularity property is
non-zero for all $\mathbf{h} \in \mathcal{P}_j$. This is equivalent to choosing $T^{(0)}$
such that $\prod_{\mathbf{h}\in\mathcal{P}_j} D_{\mathbf{h}}^{(0)} \ne 0$. Recall that $D_{\mathbf{h}}^{(0)}$ is the determinant of a $B \times B$
matrix whose entries are $B^2$ distinct variables. We can loosely
upper bound the local degree of $\prod_{\mathbf{h}\in\mathcal{P}_j} D_{\mathbf{h}}^{(0)}$ by $|\mathcal{P}_j|$. By the
Combinatorial Nullstellensatz, we can choose $T^{(0)}$ such that
the regularity property is satisfied at $t = 0$ if $q > |\mathcal{P}_j|$.

Suppose that $D_{\mathbf{h}}^{(t-1)}$ is non-zero for all $\mathbf{h} \in \mathcal{P}_j$. For each
$\mathbf{h} \in \mathcal{P}_j$, we let $T_{\mathbf{h}}^{(t)}$ be the $B \times (\alpha n)$ submatrix of $T^{(t)}$
obtained by extracting the rows associated with $\mathbf{h}$. If the rows
of $T^{(t)}$ are divided into $n$ blocks, with each block consisting
of $\alpha$ rows, then $T_{\mathbf{h}}^{(t)}$ is obtained by retaining the first $h_i$ rows
of the $i$-th block of rows of $T^{(t)}$, for $i = 1,2,\ldots,n$.

The determinant $D_{\mathbf{h}}^{(t)}$ can be written as

\[ D_{\mathbf{h}}^{(t)} = \det\bigl(T_{\mathbf{h}}^{(t)} M^{(t-1)}\bigr). \]

The entries in $T_{\mathbf{h}}^{(t)}$ involve the local encoding kernels to be
determined, but the entries in $M^{(t-1)}$ are fixed elements in
$\mathbb{F}_q$. By Theorem 9 (identifying $z$ with $k - j$), there is an
integral flow in the auxiliary graph with input $\mathbf{g}$ and output $\mathbf{h}$,
where $\mathbf{g}$ is an integral transmissive vector. This means that
if the local encoding kernels are chosen appropriately, the
square submatrix of $T_{\mathbf{h}}^{(t)}$ obtained by retaining the columns
associated with $\mathbf{g}$ is a permutation of the identity matrix,
while the other columns not associated with $\mathbf{g}$ are zero. The
square submatrix of $M^{(t-1)}$ obtained by retaining the rows
associated with $\mathbf{g}$ has non-zero determinant by the induction
hypothesis. We can thus choose the local encoding kernels
such that $D_{\mathbf{h}}^{(t)}$ evaluates to a non-zero value. In particular,
$D_{\mathbf{h}}^{(t)}$ is a non-zero polynomial with the local encoding kernels
as the variables. After multiplying $D_{\mathbf{h}}^{(t)}$ over all $\mathbf{h} \in \mathcal{P}_j$, we
see that $\prod_{\mathbf{h}\in\mathcal{P}_j} D_{\mathbf{h}}^{(t)}$ is a non-zero polynomial.

Each local encoding kernel appears in at most $r\alpha$ rows
in the determinant $D_{\mathbf{h}}^{(t)}$. The local degree of $\prod_{\mathbf{h}\in\mathcal{P}_j} D_{\mathbf{h}}^{(t)}$
can be upper bounded by $r\alpha|\mathcal{P}_j|$. By the Combinatorial
Nullstellensatz, we can choose the local encoding vectors in
stage $t$ such that the regularity property will continue to hold
in stage $t$ provided that

\[ q > r\alpha|\mathcal{P}_j| = r\bigl(2(d - k + j) + r - 1\bigr)|\mathcal{P}_j|. \]

This proves that the Pareto-optimal operating point of the first
type can be achieved by linear network coding.

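Pareto-optimal operating point of the second type: Let
$i$ be an integer between 0 and $\lfloor k/r \rfloor$, and set

\[ B = k\bigl(d + r(i+1) - k\bigr) - r^2 i(i+1)/2, \]
\[ \beta_1 = \beta_2 = 1, \]
\[ \alpha = d - k + r(i+1), \quad\text{and} \]
\[ \gamma = d + r - 1. \]

Let $\mathcal{Q}_i$ be the subset of vectors in $\mathbb{Z}^n$ which are majorized
by

\[ [\,\underbrace{\alpha\ \cdots\ \alpha}_{k-ir\ \text{times}}\ \ \underbrace{\alpha-r\ \cdots\ \alpha-r}_{r\ \text{times}}\ \cdots\ \underbrace{\alpha-ir\ \cdots\ \alpha-ir}_{r\ \text{times}}\ \ \underbrace{0\ \cdots\ 0}_{n-k\ \text{times}}\,]. \]

Fig. 12. Tradeoff between storage and repair-bandwidth ($d = 21$, $k = 20$, $B = 1$).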

By similar arguments as in the previous subsection, we can
guarantee that the regularity property with respect to $\mathcal{Q}_i$ is
satisfied in all stages provided that $q > r\bigl(d - k + r(i+1)\bigr)|\mathcal{Q}_i|$.
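
The parameters of the second type can be checked in the same way. The sketch below is ours; it assumes that the majorizing vector consists of $k - ir$ copies of $\alpha$, followed by $r$ copies each of $\alpha - r, \alpha - 2r, \ldots, \alpha - ir$, and then $n-k$ zeros, which is the form consistent with the components listed in Appendix C. The sample values are illustrative.

```python
def second_type_point(d, k, r, i):
    """Parameters (B, alpha, gamma) of the second-type operating point indexed by i."""
    B = k * (d + r * (i + 1) - k) - r * r * i * (i + 1) // 2
    alpha = d - k + r * (i + 1)
    gamma = d + r - 1
    return B, alpha, gamma

def second_type_majorizer(n, k, r, alpha, i):
    """k - i*r copies of alpha, then r copies of alpha - m*r for m = 1..i, then zeros."""
    vec = [alpha] * (k - i * r)
    for m in range(1, i + 1):
        vec += [alpha - m * r] * r
    return vec + [0] * (n - k)

d, k, r, n = 6, 5, 2, 8
for i in range(k // r + 1):
    B, alpha, gamma = second_type_point(d, k, r, i)
    assert sum(second_type_majorizer(n, k, r, alpha, i)) == B
    print(i, B, alpha, gamma)
```

For $i = 0$ the point reduces to $\alpha = d - k + r$ and $B = k\alpha$, i.e., the minimum-storage case.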

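Theorem 17. If the size of the finite field $q$ is larger than

\[ \max_{j=2,\ldots,k} r\bigl(2(d - k + j) + r - 1\bigr)|\mathcal{P}_j| \quad\text{and}\quad \max_{i=0,\ldots,\lfloor k/r \rfloor} r\bigl(d - k + r(i+1)\bigr)|\mathcal{Q}_i|, \]

then we can implement linear network codes over $\mathbb{F}_q$ for
functional and cooperative repair, attaining the boundary
points of $\mathcal{C}_{\mathrm{MF}}$. Thus,

\[ \mathcal{C}_{\mathrm{MF}} = \mathcal{C}_{\mathrm{AD}}. \]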

Proof: We have already shown that the corner points of
$\mathcal{C}_{\mathrm{MF}}$ can be achieved by linear network coding. By an analog
of the "time-sharing" argument, we see that all boundary points
of $\mathcal{C}_{\mathrm{MF}}$ are achievable by linear network coding, if the finite
field size is sufficiently large. Therefore, $\mathcal{C}_{\mathrm{MF}} \subseteq \mathcal{C}_{\mathrm{AD}}$. Since
any operating point not in $\mathcal{C}_{\mathrm{MF}}$ is not admissible (see (11)),
we conclude that $\mathcal{C}_{\mathrm{MF}} = \mathcal{C}_{\mathrm{AD}}$. ■

Remark: The requirement on finite field size in the previous 
theorem does not depend on the number of data collectors, and 
does not depend on the number of stages. 

We have the following as an immediate corollary. 

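Corollary 18. Given parameters $d$, $k$, $r$ and $B$, the MSCR
point $(\gamma_{\mathrm{MSCR}}, \alpha_{\mathrm{MSCR}})$ is achieved by $\beta_1 = \beta_2$. The MBCR
point $(\gamma_{\mathrm{MBCR}}, \alpha_{\mathrm{MBCR}})$ is achieved by $\beta_1 = 2\beta_2$.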

In Fig. 12 we plot the tradeoff curves for distributed storage
systems with parameters $B = 1$, $d = 21$, $k = 20$, and $r =
1,3,5,7,9,11,13$. The number of storage nodes can be any
integer larger than or equal to $d + 13 = 34$. The repair degree
$d$ is kept constant in the comparison. The curve for $r = 1$
is the tradeoff curve for single-node-repair regenerating codes.
Naturally, we have a better tradeoff curve when the number
of cooperating newcomers increases. In Fig. 12 we indicate
the Pareto-optimal operating points of the first type by dots
and operating points of the second type by squares. With one
exception, all Pareto-optimal points of the second type are the
MSCR points.

We compare below the repair-bandwidth of three different 
modes of repair with minimum storage per node. The param- 
eters are n = 7, B = 1, k = 3, and a = 1/3. Suppose that 
three nodes have failed. 

(i) Individual repair without newcomer cooperation. Each
newcomer connects to the four remaining storage nodes. The
repair-bandwidth per newcomer is

\[ \frac{Bd}{k(d+1-k)} = \frac{4}{3(4+1-3)} = 0.6667. \]

(ii) One-by-one repair utilizing the newly regenerated node
as a helper. The average repair-bandwidth per newcomer is

\[ \frac{1}{3}\left(\frac{4}{3(4+1-3)} + \frac{5}{3(5+1-3)} + \frac{6}{3(6+1-3)}\right) = 0.5741. \]

The first term in the parentheses is the repair-bandwidth of
the first newcomer, which downloads from the four surviving
nodes; the second term is the repair-bandwidth of the second
newcomer, which connects to the four surviving nodes and the
newly regenerated newcomer, and so on.

(iii) Full cooperation among the three newcomers. With
$r = 3$, Corollary 18 gives the following lower bound on the
repair-bandwidth,

\[ \frac{4+3-1}{3(4+3-3)} = 0.5. \]
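
The three figures can be reproduced with exact rational arithmetic. The following Python sketch is ours: it uses the minimum-storage formulas $Bd/(k(d+1-k))$ for single-node repair and $B(d+r-1)/(k(d+r-k))$ for cooperative repair, and the function names are chosen for illustration only.

```python
from fractions import Fraction

def single_repair_bandwidth(B, d, k):
    """Single-node repair at minimum storage: B*d / (k*(d + 1 - k))."""
    return Fraction(B * d, k * (d + 1 - k))

def cooperative_repair_bandwidth(B, d, k, r):
    """Per-newcomer bandwidth of cooperative repair at minimum storage."""
    return Fraction(B * (d + r - 1), k * (d + r - k))

B, k = 1, 3
individual = single_repair_bandwidth(B, 4, k)                        # mode (i)
one_by_one = sum(single_repair_bandwidth(B, d, k) for d in (4, 5, 6)) / 3  # mode (ii)
cooperative = cooperative_repair_bandwidth(B, 4, k, 3)               # mode (iii)
print(float(individual), float(one_by_one), float(cooperative))
```

The exact values are $2/3$, $31/54$ and $1/2$ respectively.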

Finally, we compare the storage efficiency for fixed $n$, $d$ and
$k$, by increasing the number of nodes in cooperative repair. For
MSCR, the storage efficiency is $k/n$, independent of $r$. For
MBCR, the storage efficiency is

\[ \frac{k(2d + r - k)}{n(2d + r - 1)}. \]

If we fix $n$, $d$ and $k$ and increase $r$, then the storage efficiency
increases.
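
The monotonicity in $r$ can be seen by rewriting the MBCR storage efficiency as follows (a one-line rearrangement, added here for convenience):

```latex
\[
  \frac{k(2d + r - k)}{n(2d + r - 1)}
  \;=\; \frac{k}{n}\left(1 - \frac{k-1}{2d + r - 1}\right).
\]
```

For $k > 1$ the subtracted term decreases as $r$ grows, so the efficiency increases with $r$ and approaches the MSCR value $k/n$ from below.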

VII. Two Families of Explicit Cooperative 
Regenerating Codes 

In this section we present two families of explicit
constructions of optimal cooperative regenerating codes for exact
repair, one for MSCR and one for MBCR. The constructed
regenerating codes are systematic, meaning that the native data
packets are stored somewhere in the storage network. Hence, if
a data collector is interested in part of the data file, he/she can
contact some particular storage nodes and download directly
without any decoding. Both constructions are for the case
$d = k$. We note that all single-failure regenerating codes for
$d = k$ are trivial, but in the multi-failure case, something
interesting can be done when $d = k$.

A. Construction of MSCR Codes for Exact Repair

We divide the data file into chunks. Each chunk contains $B$
elements in a finite field $\mathbb{F}_q$. The size of the finite field will
be determined later. Each node stores $r$ elements in $\mathbb{F}_q$. The
parameters of the cooperative regenerating code are

\[ d = k, \quad B = kr, \quad n \ge d + r, \]
\[ \alpha = r, \quad \gamma = d + r - 1. \]

It matches the MSCR point.

We need an $(n,k)$ MDS code as a building block. The code
length is equal to the number of storage nodes. This can be
furnished by a Reed-Solomon code for instance, as long as the
finite field size $q$ is larger than the total number of storage
nodes $n$ [33]. We let $G$ be the $n \times k$ generating matrix of
an $(n,k)$ MDS code over $\mathbb{F}_q$. For $i = 1,2,\ldots,n$, let the $i$-th
row of $G$ be denoted by $\mathbf{g}_i$. By the defining property of MDS
codes, every set of $k$ rows of $G$ forms a nonsingular matrix.
We group the $kr$ finite field elements in a chunk of data into
$r$ $k$-dimensional column vectors, $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_r$. The data
stored in the $i$-th storage node are $\mathbf{g}_i\mathbf{m}_j$, for $j = 1,2,\ldots,r$.
We will treat a finite field element as a packet.

Suppose that a data collector connects to storage nodes $i_1$,
$i_2, \ldots, i_k$. It downloads $kr$ packets $\mathbf{g}_{i_\ell}\mathbf{m}_j$, for $j = 1,2,\ldots,r$
and $\ell = 1,2,\ldots,k$. If we put the $k$ symbols $\mathbf{g}_{i_\ell}\mathbf{m}_j$, $\ell =
1,2,\ldots,k$, together as a column vector, then this vector can
be expressed as

\[ \begin{bmatrix} \mathbf{g}_{i_1} \\ \vdots \\ \mathbf{g}_{i_k} \end{bmatrix} \mathbf{m}_j. \]

The $k \times k$ matrix in the line above is non-singular by the MDS
property. We can thus solve for $\mathbf{m}_j$. This establishes the $(n,k)$
recovery property.



Suppose that nodes $i_1, i_2, \ldots, i_r$ fail. We want to repair
them exactly with repair-bandwidth $d + r - 1$ per newcomer. In
the first phase, the $r$ newcomers have to agree upon an ordering
among themselves. For $\ell = 1,2,\ldots,r$, the $\ell$-th newcomer
connects to any $d$ surviving storage nodes, say nodes
$\iota_{\ell,1}, \iota_{\ell,2}, \ldots, \iota_{\ell,d}$, and downloads $\mathbf{g}_{\iota_{\ell,x}}\mathbf{m}_\ell$ from node $\iota_{\ell,x}$, for
$x = 1,2,\ldots,d$. Then newcomer $\ell$ can decode $\mathbf{m}_\ell$ by the
MDS property by the end of the first phase. The total number of
packets transmitted in the first phase is $rd$.

In the second phase, newcomer $y$ computes and sends
$\mathbf{g}_{i_x}\mathbf{m}_y$ to newcomer $x$, for $x \ne y$. Newcomer $x$ stores
$\mathbf{g}_{i_x}\mathbf{m}_1, \ldots, \mathbf{g}_{i_x}\mathbf{m}_r$ at the end. A total of $r(r-1)$ packets are required
in the second phase. The $r$ newcomers are regenerated with
repair-bandwidth per newcomer equal to $d + r - 1$.
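
The two repair phases above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the field is the prime field $\mathbb{F}_{13}$, the MDS generator is a Vandermonde (Reed-Solomon style) matrix, nodes are indexed from 0, and all function names are ours. It requires Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
P = 13  # prime field size (assumption); must exceed the number of nodes n

def vandermonde(n, k, p=P):
    """n x k generator whose rows are (1, a, a^2, ...) for distinct a: any k rows are independent."""
    return [[pow(a, e, p) for e in range(k)] for a in range(1, n + 1)]

def dot(u, v, p=P):
    return sum(a * b for a, b in zip(u, v)) % p

def solve(A, y, p=P):
    """Solve the k x k system A m = y over F_p by Gauss-Jordan elimination."""
    k = len(A)
    M = [row[:] + [val] for row, val in zip(A, y)]
    for c in range(k):
        piv = next(i for i in range(c, k) if M[i][c] % p)
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)
        M[c] = [x * inv % p for x in M[c]]
        for i in range(k):
            if i != c and M[i][c]:
                f = M[i][c]
                M[i] = [(x - f * z) % p for x, z in zip(M[i], M[c])]
    return [M[i][k] for i in range(k)]

def encode(G, msgs):
    """Node i stores g_i . m_j for j = 1, ..., r."""
    return [[dot(g, m) for m in msgs] for g in G]

def cooperative_repair(G, nodes, failed, d):
    """Repair the failed nodes; returns (repaired contents, packets received per newcomer)."""
    k, r = len(G[0]), len(failed)
    assert d == k
    helpers = [i for i in range(len(G)) if i not in failed][:d]
    # Phase 1: the l-th newcomer downloads g_i . m_l from d helpers and decodes m_l.
    decoded = [solve([G[i] for i in helpers], [nodes[i][l] for i in helpers])
               for l in range(r)]
    # Phase 2: the newcomer holding m_y sends g_x . m_y to the newcomer replacing node x.
    repaired = {x: [dot(G[x], decoded[y]) for y in range(r)] for x in failed}
    return repaired, d + (r - 1)

n, k, r = 7, 3, 3
G = vandermonde(n, k)
msgs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
nodes = encode(G, msgs)
repaired, bandwidth = cooperative_repair(G, nodes, failed=[0, 2, 5], d=k)
assert all(repaired[x] == nodes[x] for x in (0, 2, 5))
print(bandwidth)
```

For simplicity every newcomer uses the same $d$ helpers here, although the construction allows each newcomer to choose its own.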

B. Construction of MBCR Codes for Exact Repair

The parameters of the cooperative regenerating code are

\[ d = k, \quad B = k(k + r), \]
\[ n = d + r, \quad \alpha = \gamma = 2d + r - 1. \]

It matches the MBCR point.

Each chunk of data consists of $B = k(2d + r - k) = kn$
data packets, considered as elements in $GF(q)$. In each chunk
let the $kn$ data packets be $x_0, x_1, \ldots, x_{kn-1}$. We divide them
into $n$ groups. The first group consists of $x_0, x_1, \ldots, x_{k-1}$,
the second group consists of $x_k, x_{k+1}, \ldots, x_{2k-1}$, and so
on. For notational convenience, we let the column vector
$\mathbf{x}_j = [x_{(j-1)k}\ x_{(j-1)k+1}\ \ldots\ x_{jk-1}]^T$ represent the
data packets in the $j$-th group ($1 \le j \le n$). (We use superscript
$T$ to denote the transpose operator.)

For $i = 1,2,\ldots,n$, we construct the content of node $i$ as
follows. We first put the $k$ data packets in the $i$-th group
into node $i$ and then $n - 1$ parity-check packets

\[ \mathbf{v}_1 \cdot \mathbf{x}_{i\oplus 1},\ \mathbf{v}_2 \cdot \mathbf{x}_{i\oplus 2},\ \ldots,\ \mathbf{v}_{n-1} \cdot \mathbf{x}_{i\oplus(n-1)} \]

into node $i$, where "$\cdot$" is the dot product of vectors and $\oplus$ is
modulo-$n$ addition defined by

\[ x \oplus y := \begin{cases} x + y & \text{if } x + y \le n, \\ x + y - n & \text{if } x + y > n. \end{cases} \]

Here $\mathbf{v}_j$ ($j = 1,2,\ldots,n-1$) are the row vectors of an $(n-1) \times
k$ generating matrix, $G = [\mathbf{v}_1^T\ \mathbf{v}_2^T\ \ldots\ \mathbf{v}_{n-1}^T]^T$, of an
MDS code over $GF(q)$ of length $n - 1$ and dimension $k$. By
the defining property of MDS codes, any $k$ rows of $G$ are
linearly independent over $GF(q)$.

As for the file reconstruction processing, suppose without
loss of generality that a data collector connects to nodes 1,
$2, \ldots, k$. The systematic packets $x_0, x_1, \ldots, x_{k^2-1}$ in the first
$k$ groups can be downloaded directly, because they are stored
in node 1 to node $k$ uncoded. The $j$-th group of data packets
($j > k$) (the components in vector $\mathbf{x}_j$) can be reconstructed
from $\mathbf{v}_{j-1} \cdot \mathbf{x}_j, \mathbf{v}_{j-2} \cdot \mathbf{x}_j, \ldots, \mathbf{v}_{j-k} \cdot \mathbf{x}_j$, by the MDS property.
A data collector connecting to any other $k$ storage nodes can
decode similarly.

As for the cooperative repair processing, suppose without
loss of generality that nodes $k + 1$ to $n$ fail at the same time.
The repair process proceeds as follows.

Step 1: For $i = 1,2,\ldots,k$, node $i$ computes $\mathbf{v}_{n+i-j} \cdot \mathbf{x}_i$ and
sends it to newcomer $j$, for $j = k+1, k+2, \ldots, n$.

Step 2: For $j = k+1, k+2, \ldots, n$, newcomer $j$ downloads
$k$ packets $\mathbf{v}_{j-1} \cdot \mathbf{x}_j, \mathbf{v}_{j-2} \cdot \mathbf{x}_j, \ldots, \mathbf{v}_{j-k} \cdot \mathbf{x}_j$ from
nodes 1 to $k$.

Step 3: For $j = k+1, k+2, \ldots, n$, newcomer $j$ can solve
for the systematic packets in $\mathbf{x}_j$. Then node $j$ sends
$\mathbf{v}_{j+i-1} \cdot \mathbf{x}_j$ to node $n - i + 1$, for $i = 1,2,\ldots,n-j$, and
sends $\mathbf{v}_i \cdot \mathbf{x}_j$ to node $j - i$, for $i = 1,2,\ldots,j-k-1$.



In steps 1 and 2, a total of $2k(n-k) = 2kr$ packets are
transmitted. In step 3, each newcomer transmits $r - 1$ packets.
The total number of packets required in the whole repair
process is $2kr + r(r-1) = r(2d + r - 1)$. The repair-bandwidth
per newcomer is therefore $2d + r - 1$ packets.

We illustrate this MBCR construction by an example. The
parameters are $n = 5$, $k = d = 3$ and $r = 2$. A file is divided
into $B = 15$ packets, each containing an equal number of bits.
Let the packets be $x_1, x_2, \ldots, x_{15}$. Each node stores $\alpha = 7$
packets. We use the following matrix

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix} \]

over $\mathbb{F}_2$ as the generating matrix of the MDS code. The content
of the storage nodes is shown in Table I.

TABLE I
An MBCR code for n = 5, k = 3 and r = 2. The packets in each row are the content of the corresponding node.

Node | Packets
1 | $x_1, x_2, x_3$ | $x_4$ | $x_8$ | $x_{12}$ | $x_{13}+x_{14}+x_{15}$
2 | $x_1+x_2+x_3$ | $x_4, x_5, x_6$ | $x_7$ | $x_{11}$ | $x_{15}$
3 | $x_3$ | $x_4+x_5+x_6$ | $x_7, x_8, x_9$ | $x_{10}$ | $x_{14}$
4 | $x_2$ | $x_6$ | $x_7+x_8+x_9$ | $x_{10}, x_{11}, x_{12}$ | $x_{13}$
5 | $x_1$ | $x_5$ | $x_9$ | $x_{10}+x_{11}+x_{12}$ | $x_{13}, x_{14}, x_{15}$

Each node contains six uncoded packets and one parity-check
packet. The addition in the above table is component-wise
binary addition. In this example the parity-check packets
are computed simply by exclusive-OR (XOR). For example,
in node 1, we XOR packets $x_{13}$, $x_{14}$ and $x_{15}$ to obtain the
parity-check packet. We note that there is a cyclic symmetry
in the encoding. The indices of the packets in node 2 can be
obtained by adding 3 to the corresponding indices in node 1
and reducing modulo 15. From node 2, we can add 3 modulo
15 to the packet indices and get the packets in node 3, and so
on.

It can be verified that the code satisfies the (5, 3) recovery property.
We note that this code is not MDS, as each node stores more
than $15/3 = 5$ packets.

Suppose that node 1 and node 2 fail. Newcomers 1 and 2
regenerate the failed nodes by downloading some packets from
the other $d = 3$ surviving storage nodes. The packets sent
from the surviving nodes to the two newcomers are shown in
Fig. 13. Each surviving node transmits two packets to each of
the two newcomers. Node 3 sends $x_3$ and $x_8$ to newcomer 1,
and sends $x_7$ and $x_4 + x_5 + x_6$ to newcomer 2. Node 4 sends
$x_2$ and $x_{12}$ to newcomer 1 and sends $x_6$ and $x_{11}$ to
newcomer 2. Node 5 sends $x_1$ and $x_{13}+x_{14}+x_{15}$ to newcomer
1 and sends $x_5$ and $x_{15}$ to newcomer 2. The contents of the
two newcomers after the first phase of recovery are shown in
Fig. 13. Notice that newcomer 1 is able to compute the
parity-check packet $x_1 + x_2 + x_3$ by adding $x_1$, $x_2$ and $x_3$.
Similarly, newcomer 2 computes $x_4$ from the three packets
$x_5$, $x_6$ and $x_4 + x_5 + x_6$.

Fig. 13. Cooperative recovery of nodes 1 and 2.

In the second phase, newcomer 1 sends $x_1 + x_2 + x_3$ to
newcomer 2, and in return, newcomer 2 sends $x_4$ to
newcomer 1. The data exchange in the second phase is indicated
by the dashed arrows in Fig. 13. This completes the joint repair
of nodes 1 and 2. The number of packet transmissions per
newcomer is equal to $\gamma = 7$.

Any double failures can be recovered similarly with
repair-bandwidth per newcomer equal to 7. This is indeed optimal
by Corollary 18.
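
The example can be checked mechanically. The Python sketch below is ours and only illustrative: each stored packet is represented by a bitmask recording which of $x_1, \ldots, x_{15}$ are XORed together, the node contents are generated from the rule $\mathbf{v}_m \cdot \mathbf{x}_{i \oplus m}$ with the $4 \times 3$ binary generating matrix of the example, and the $(5,3)$ recovery property and the joint repair of nodes 1 and 2 are verified by rank computations over $\mathbb{F}_2$. All helper names are assumptions.

```python
from itertools import combinations

N, K = 5, 3                                        # n = 5 nodes, k = d = 3
G = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]   # rows v_1, ..., v_4 over F_2

def oplus(x, y):
    """Modulo-n addition on the node labels 1, ..., n."""
    return x + y if x + y <= N else x + y - N

def mask(indices):
    """Bitmask of an XOR of packets: bit t-1 is set when x_t takes part in the sum."""
    return sum(1 << (t - 1) for t in indices)

def node_content(i):
    """Packets of node i: its own group, then v_m . x_{i (+) m} for m = 1, ..., n-1."""
    own = [(i - 1) * K + t for t in range(1, K + 1)]
    packets = [mask([t]) for t in own]
    for m in range(1, N):
        first = (oplus(i, m) - 1) * K + 1
        packets.append(mask(first + c for c in range(K) if G[m - 1][c]))
    return packets

def rank_gf2(masks):
    """Rank over F_2 of the vectors encoded as bitmasks."""
    basis = []
    for v in masks:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

nodes = {i: node_content(i) for i in range(1, N + 1)}

# (5, 3) recovery property: any three nodes determine all 15 packets.
assert all(rank_gf2(nodes[a] + nodes[b] + nodes[c]) == 15
           for a, b, c in combinations(nodes, 3))

# Joint repair of nodes 1 and 2: six packets from nodes 3, 4, 5 plus one exchanged packet.
recv1 = [mask(s) for s in ([3], [8], [2], [12], [1], [13, 14, 15])] + [mask([4])]
recv2 = [mask(s) for s in ([7], [4, 5, 6], [6], [11], [5], [15])] + [mask([1, 2, 3])]
for received, lost in ((recv1, nodes[1]), (recv2, nodes[2])):
    # Seven received packets whose span contains every packet of the failed node.
    assert len(received) == 7 and rank_gf2(received + lost) == rank_gf2(received)
```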



VIII. Concluding Remarks 

One of the technical difficulties in this paper is to deal 
with an unbounded information flow graph. The number of 
sink nodes may be infinite. By exploiting the structure of the 
information flow graph, we show that the points on the fun- 
damental tradeoff curve can be attained by linear cooperative 
regenerating codes. Furthermore, we can work over a fixed 
finite field for arbitrarily large number of repairs. 

The proof technique invokes a concept called "submodular
flow" from combinatorial optimization, which can be regarded
as a generalization of the max-flow-min-cut theorem. This
mathematical tool has also been used in [34] to determine
the capacity of the linear deterministic model of relay networks.
Indeed, the information flow graph studied in this paper shares
some common features with the linear deterministic model for
wireless relay networks proposed in [35]. A closely related
notion, called "linking system", finds application in perfect
secret key agreement [36], [37], and in wireless information
flow [38], [39].

Recently, constructions of cooperative regenerating codes
achieving the MBCR point are given in [40] and [41]. Further
explicit construction for other parameters is an interesting
research problem for future studies.

Acknowledgements 

We would like to thank Chung Chan and Frederique Oggier 
for useful discussions, and the anonymous reviewers for their 
careful readings. 






Appendix A 

The MSCR Point under Non-Homogeneous Traffic 

Homogeneous traffic is assumed in the main text of this 
paper; a newcomer downloads equal amount of data from the 
d surviving nodes, and each pair of newcomers exchange equal 
amount of data. We show in this appendix that the assumption 
of homogeneous traffic is not essential at the minimum-storage 
point, i.e., the repair-bandwidth cannot be decreased when a = 
B/k if property of homogeneous traffic is relaxed. 

Suppose that each newcomer connects to $d$ surviving nodes,
and let the total number of packets transmitted in the first phase
be $rd\beta_1$. The average number of packets per link is thus $\beta_1$.
Likewise, we let the total number of packets in the second
phase be $r(r-1)\beta_2$, so that the average number of packets
per link is $\beta_2$. We call this the non-homogeneous traffic
model. Naturally, it contains the homogeneous traffic model
as a special case, in which each newcomer downloads $\beta_1$ packets
per link in the first phase and $\beta_2$ packets per link in
the second phase. In the non-homogeneous model, it is only
required that the traffic in the first (resp. second) phases of all
repair processes be identical.

Theorem 19. Under the non-homogeneous traffic model, the
average repair-bandwidth per newcomer is lower bounded by
$B(d + r - 1)/(k(d + r - k))$ at the minimum-storage point,
i.e., $B/k = \alpha$.

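Proof: Consider the scenario where nodes 1 to $r$ fail in
stage 1. Suppose that each newcomer connects to nodes $r + 1$,
$r + 2, \ldots, r + d$. We will derive a lower bound on the
repair-bandwidth.

For $i = r+1, r+2, \ldots, r+d$ and $j = 1,2,\ldots,r$, let the
capacity of the link from surviving node $i$ to newcomer $j$ be
$\beta_1(i,j)$. Let the capacity of the link from $\mathrm{In}_{j_1}$ to $\mathrm{Mid}_{j_2}$, for
$j_1 \ne j_2$, be $\beta_2(j_1,j_2)$. The average link capacities in the first
and second phase can be written respectively as

\[ \beta_1 = \frac{1}{rd}\sum_{i=r+1}^{r+d}\sum_{j=1}^{r}\beta_1(i,j) \quad\text{and}\quad \beta_2 = \frac{1}{r(r-1)}\sum_{j_2=1}^{r}\sum_{\substack{j_1=1\\ j_1\ne j_2}}^{r}\beta_2(j_1,j_2). \]

The repair-bandwidth per newcomer is thus

\[ \frac{1}{r}\sum_{i=r+1}^{r+d}\sum_{j=1}^{r}\beta_1(i,j) + \frac{1}{r}\sum_{j_2=1}^{r}\sum_{\substack{j_1=1\\ j_1\ne j_2}}^{r}\beta_2(j_1,j_2) = d\beta_1 + (r-1)\beta_2. \]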

We consider a set of data collectors. Each data collector
in this set connects to one of the $r$ new nodes which are
regenerated in stage 1, and $k - 1$ nodes among nodes $r + 1$
to $r + d$. Let DC be such a data collector, connecting to
node $j$, for some $j \in \{1,2,\ldots,r\}$, and nodes $i_1, i_2, \ldots, i_{k-1} \in
\{r+1, r+2, \ldots, r+d\}$. Consider the cut $(\mathcal{W}, \bar{\mathcal{W}})$, with

\[ \bar{\mathcal{W}} = \{\mathrm{DC}, \mathrm{In}_j, \mathrm{Mid}_j, \mathrm{Out}_j, \mathrm{Out}_{i_1}, \mathrm{Out}_{i_2}, \ldots, \mathrm{Out}_{i_{k-1}}\}. \]

An example is given in Fig. 14 with $\bar{\mathcal{W}}$ drawn in shaded color.

Fig. 14. A DC connects to one storage node among the first $r$ nodes and
$k - 1$ storage nodes among the nodes $r + 1$ to $r + d$. (The figure shows stage 0 and stage 1.)

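There are $r\binom{d}{k-1}$ distinct choices. Each choice yields a
cut-set upper bound on the file size $B$. If we sum over all
$r\binom{d}{k-1}$ such inequalities, we obtain

\[ r\binom{d}{k-1}B \le r\binom{d}{k-1}(k-1)\alpha + \binom{d-1}{k-1}\sum_{i=r+1}^{r+d}\sum_{j=1}^{r}\beta_1(i,j) + \binom{d}{k-1}\sum_{j_2=1}^{r}\sum_{\substack{j_1=1\\ j_1\ne j_2}}^{r}\beta_2(j_1,j_2). \]

The first term on the right-hand side of the inequality comes
from the fact that each of the $r\binom{d}{k-1}$ inequalities contributes
$(k-1)\alpha$. For the second term, we note that the link from node
$i$ to node $j$ (for $r+1 \le i \le r+d$, $1 \le j \le r$) contributes
to the summation if node $i \in \mathcal{W}$ and node $j \in \bar{\mathcal{W}}$.
There are $\binom{d-1}{k-1}$ choices for the "out" nodes to be included
in $\bar{\mathcal{W}}$. Hence for each $i$, $j$, the term $\beta_1(i,j)$ is multiplied by
$\binom{d-1}{k-1}$. By a similar argument we can obtain the third term.
After dividing both sides by $r\binom{d}{k-1}$, we obtain

\[ B \le (k-1)\alpha + (d-k+1)\beta_1 + (r-1)\beta_2. \tag{42} \]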

In the rest of the proof we distinguish two cases. 

Case 1: $k \ge r$. Consider the class of data collectors who
download from nodes 1 to $r$, and $k - r$ nodes among nodes
$r + 1$ to $r + d$. For a data collector DC, say connecting to
nodes 1 to $r$, and $i_1, i_2, \ldots, i_{k-r} \in \{r+1, r+2, \ldots, r+d\}$,
we compute the capacity of the cut $(\mathcal{W}, \bar{\mathcal{W}})$ with $\bar{\mathcal{W}}$ specified
by

\[ \bar{\mathcal{W}} = \{\mathrm{DC}, \mathrm{Out}_{i_1}, \mathrm{Out}_{i_2}, \ldots, \mathrm{Out}_{i_{k-r}}\} \cup \bigcup_{j=1}^{r}\{\mathrm{In}_j, \mathrm{Mid}_j, \mathrm{Out}_j\}. \]

If we sum over the $\binom{d}{k-r}$ inequalities arising from these cuts,
we get

\[ \binom{d}{k-r}B \le \binom{d}{k-r}(k-r)\alpha + \binom{d-1}{k-r}\sum_{i=r+1}^{r+d}\sum_{j=1}^{r}\beta_1(i,j). \]

Upon dividing both sides by $\binom{d}{k-r}$, we obtain

\[ B \le (k-r)\alpha + (d-k+r)r\beta_1. \tag{43} \]

With $\alpha = B/k$, we infer from (42) and (43) that

\[ d\beta_1 + (r-1)\beta_2 \ge \frac{B(d+r-1)}{k(d+r-k)}. \tag{44} \]

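Case 2: $k < r$. Consider the class of data collectors who
connect to $k$ nodes among nodes 1 to $r$. To a data collector
DC connecting to $j_1, j_2, \ldots, j_k \in \{1,2,\ldots,r\}$, we associate
the cut $(\mathcal{W}, \bar{\mathcal{W}})$ with $\bar{\mathcal{W}}$ given by

\[ \bar{\mathcal{W}} = \{\mathrm{DC}\} \cup \bigcup_{\ell=1}^{k}\{\mathrm{In}_{j_\ell}, \mathrm{Mid}_{j_\ell}, \mathrm{Out}_{j_\ell}\}. \]

The sum of the $\binom{r}{k}$ resulting upper bounds on $B$ is

\[ \binom{r}{k}B \le \binom{r-1}{k-1}\sum_{j=1}^{r}\sum_{i=r+1}^{r+d}\beta_1(i,j) + \binom{r-2}{k-1}\sum_{j_2=1}^{r}\sum_{\substack{j_1=1\\ j_1\ne j_2}}^{r}\beta_2(j_1,j_2). \]

After dividing both sides by $\binom{r}{k}$, we get

\[ B \le kd\beta_1 + k(r-k)\beta_2. \tag{45} \]

From (42) and (45), we can deduce (44) as well. We conclude
that in the non-homogeneous traffic case, the repair-bandwidth
per newcomer cannot be smaller than $B(d+r-1)/(k(d+r-k))$
when $\alpha = B/k$. ■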
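
For completeness, the algebra leading from the cut-set bounds to the lower bound on $d\beta_1 + (r-1)\beta_2$ can be spelled out as follows. This is our own expansion of the step, with the three bounds restated after substituting $\alpha = B/k$.

```latex
% Bounds after substituting alpha = B/k:
%   first cut:      B/k <= (d-k+1) beta_1 + (r-1) beta_2
%   Case 1 cut:     beta_1 >= B / (k (d+r-k))
%   Case 2 cut:     B/k <= d beta_1 + (r-k) beta_2
\begin{align*}
\text{Case 1:}\quad d\beta_1 + (r-1)\beta_2
  &= \bigl[(d-k+1)\beta_1 + (r-1)\beta_2\bigr] + (k-1)\beta_1 \\
  &\ge \frac{B}{k} + \frac{(k-1)B}{k(d+r-k)}
   = \frac{B(d+r-1)}{k(d+r-k)}. \\[4pt]
\text{Case 2:}\quad d\beta_1 + (r-1)\beta_2
  &= \frac{d}{d+r-k}\bigl[(d-k+1)\beta_1 + (r-1)\beta_2\bigr]
   + \frac{r-1}{d+r-k}\bigl[d\beta_1 + (r-k)\beta_2\bigr] \\
  &\ge \left(\frac{d}{d+r-k} + \frac{r-1}{d+r-k}\right)\frac{B}{k}
   = \frac{B(d+r-1)}{k(d+r-k)}.
\end{align*}
```

In Case 2 both weights are non-negative, and the coefficients of $\beta_1$ and $\beta_2$ on the right-hand side of the first line add up to $d$ and $r-1$ respectively.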

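Appendix B
Proof of Lemma 10

We first show that it is sufficient to verify that the
condition (39) in Lemma 10 holds for subsets $S$ in the form of

\[ \Bigl(\bigcup_{i\in\mathcal{A}}\{\mathrm{In}_i, \mathrm{Mid}_i, \mathrm{Out}_i\}\Bigr) \cup \Bigl(\bigcup_{j\in\mathcal{B}}\{v_j, \mathrm{Out}_j\}\Bigr), \tag{46} \]

where $\mathcal{A}$ is a subset of $\{1,2,\ldots,r\}$ and $\mathcal{B}$ is a subset of
$\{r+1, r+2, \ldots, n\}$.

An example of a subset $S$ in the form of (46) is illustrated
in Fig. M

(a) Suppose for some $j \in \{r+1, r+2, \ldots, n\}$, $S$ contains
$v_j$ but does not contain $\mathrm{Out}_j$. Then the directed edge $e =
(v_j, \mathrm{Out}_j)$ is in $\Delta_S^{\mathrm{out}}$, and makes a contribution of $h_j$ to the
term $\mathrm{lb}(\Delta_S^{\mathrm{out}})$. But the inequality $\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) \le \sigma(S)$
holds if and only if

\begin{align*}
& \mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) - h_j \le \sigma(S) - h_j \\
\Leftrightarrow\ & \mathrm{lb}(\Delta_{S\cup\{\mathrm{Out}_j\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\cup\{\mathrm{Out}_j\}}^{\mathrm{in}}) \le \sigma(S \cup \{\mathrm{Out}_j\})
\end{align*}

holds.

An analogous argument shows that if $S$ contains $\mathrm{Out}_j$ but
does not contain $v_j$ for some $j \in \{r+1, r+2, \ldots, n\}$, then
the validity of $\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) \le \sigma(S)$ is equivalent to

\begin{align*}
& \mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) + h_j \le \sigma(S) + h_j \\
\Leftrightarrow\ & \mathrm{lb}(\Delta_{S\setminus\{\mathrm{Out}_j\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\setminus\{\mathrm{Out}_j\}}^{\mathrm{in}}) \le \sigma(S \setminus \{\mathrm{Out}_j\}).
\end{align*}

Hence it is sufficient to consider subsets $S$ which contain either
both $v_j$ and $\mathrm{Out}_j$, or none of them.

(b) Suppose that $S$ contains $\mathrm{Mid}_i$ for some $i \in \{1,2,\ldots,r\}$.
Since the link from $\mathrm{In}_i$ to $\mathrm{Mid}_i$ has infinite upper bound, the
left-hand side of (39) is equal to $-\infty$ if $\mathrm{In}_i$ is not included in
$S$. Then the inequality in (39) holds trivially. We can assume
without loss of generality that $\mathrm{In}_i \in S$ if $\mathrm{Mid}_i \in S$.

On the other hand, if $\mathrm{Out}_i$ is not included in $S$, the
inequality $\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) \le \sigma(S)$ is implied by

\[ \mathrm{lb}(\Delta_{S\cup\{\mathrm{Out}_i\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\cup\{\mathrm{Out}_i\}}^{\mathrm{in}}) \le \sigma(S \cup \{\mathrm{Out}_i\}), \]

because

\begin{align*}
\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) &\le \mathrm{lb}(\Delta_{S\cup\{\mathrm{Out}_i\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\cup\{\mathrm{Out}_i\}}^{\mathrm{in}}) \\
&\le \sigma(S \cup \{\mathrm{Out}_i\}) \\
&= \sigma(S) - h_i \le \sigma(S).
\end{align*}

We can thus assume that if $\mathrm{Mid}_i$ is in $S$ then $\mathrm{Out}_i$ is also
in $S$.

(c) Suppose that for some $i \in \{1,2,\ldots,r\}$, $S$ contains $\mathrm{In}_i$
but does not contain $\mathrm{Mid}_i$ and $\mathrm{Out}_i$. In this case the inequality
$\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) \le \sigma(S)$ is implied by

\[ \mathrm{lb}(\Delta_{S\cup\{\mathrm{Mid}_i,\mathrm{Out}_i\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\cup\{\mathrm{Mid}_i,\mathrm{Out}_i\}}^{\mathrm{in}}) \le \sigma(S \cup \{\mathrm{Mid}_i, \mathrm{Out}_i\}), \]

because

\begin{align*}
\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) &= \mathrm{lb}(\Delta_{S\cup\{\mathrm{Mid}_i,\mathrm{Out}_i\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\cup\{\mathrm{Mid}_i,\mathrm{Out}_i\}}^{\mathrm{in}}) \\
&\le \sigma(S \cup \{\mathrm{Mid}_i, \mathrm{Out}_i\}) \\
&= \sigma(S) - h_i \le \sigma(S).
\end{align*}

(d) Suppose that $S$ contains $\mathrm{Out}_i$ but does not contain
$\mathrm{Mid}_i$ for some $i \in \{1,2,\ldots,r\}$. In this case, the inequality
$\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) \le \sigma(S)$ is implied by the following two
inequalities

\[ \mathrm{lb}(\Delta_{S\setminus\{\mathrm{Out}_i\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{S\setminus\{\mathrm{Out}_i\}}^{\mathrm{in}}) \le \sigma(S \setminus \{\mathrm{Out}_i\}), \tag{47} \]

\[ \mathrm{lb}(\Delta_{\{\mathrm{Out}_i\}}^{\mathrm{out}}) - \mathrm{ub}(\Delta_{\{\mathrm{Out}_i\}}^{\mathrm{in}}) \le \sigma(\{\mathrm{Out}_i\}). \tag{48} \]

(We can add (47) and (48) to obtain the inequality in (39).)
The inequality in (48) is simply equivalent to $-\alpha \le -h_i$,
which holds by the assumption on $\mathbf{h}$. Hence, the checking of
(39) amounts to the checking of (47).

Paragraphs (b), (c) and (d) above imply that we can consider
$S$ which contains either none of $\mathrm{In}_i$, $\mathrm{Mid}_i$ and $\mathrm{Out}_i$, or all of
them. This completes the proof of the claim.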

Now, we prove that the inequality in (39) is valid for subsets
$S$ in the form of (46). We let the cardinality of $\mathcal{A}$ and $\mathcal{B}$ be $a$
and $b$ respectively. Obviously we have $a \le r$ and $b \le n - r$.

In the following we will use $(x)^+$ as a short-hand notation
for $\max(0, x)$.

Because $\mathrm{lb}(\Delta_S^{\mathrm{out}}) = 0$, we have

\begin{align*}
\mathrm{lb}(\Delta_S^{\mathrm{out}}) - \mathrm{ub}(\Delta_S^{\mathrm{in}}) &= -\mathrm{ub}(\Delta_S^{\mathrm{in}}) \\
&\le -a\bigl((d-b)^+\beta_1 + (r-a)\beta_2\bigr) \\
&= -a\bigl(2(d-b)^+ + (r-a)\bigr).
\end{align*}

It suffices to show that

\[ -a\bigl(2(d-b)^+ + (r-a)\bigr) \le \theta_b - h(\mathcal{A}) - h(\mathcal{B}) = \sigma(S). \]

As $h(\mathcal{A}) + h(\mathcal{B})$ is less than or equal to $\theta_{a+b}$ by hypothesis,
it is sufficient to show that

\[ -a\bigl(2(d-b)^+ + r - a\bigr) \le \theta_b - \theta_{a+b}, \]

or equivalently

\[ \theta_{a+b} - \theta_b \le a\bigl(2(d-b)^+ + r - a\bigr). \tag{49} \]

We prove the asserted inequality in (49) by distinguishing
three cases.

Case 1, $a + b \le k$: We first note that for $j \le k$, we have

\begin{align*}
\theta_j - \theta_{j-1} &= \begin{cases} \alpha & \text{if } 0 < j \le z, \\ \alpha - 2(j-z-1) & \text{if } z < j \le k \end{cases} \\
&\le \alpha - 2(j-z-1) \quad \text{for } 0 < j \le k.
\end{align*}

Hence, for $0 < j \le k$, we have the following upper bound

\[ \theta_j - \theta_{j-1} \le 2(d-z) + r - 1 - 2(j-z-1) = 2(d-j) + r + 1. \]

Summing the above inequality for $j$ from $b+1$ to $a+b$,
we obtain

\begin{align*}
\theta_{a+b} - \theta_b &= \sum_{j=b+1}^{a+b}(\theta_j - \theta_{j-1}) \\
&\le \sum_{j=b+1}^{a+b}\bigl[2(d-j) + r + 1\bigr] \\
&= a\bigl(2(d-b) + r - a\bigr) \\
&= a\bigl(2(d-b)^+ + r - a\bigr).
\end{align*}

We have used the assumptions that $d \ge k$ and $k \ge b$ to derive
the last equality.

Case 2, $b < k < a + b$: Since $\theta_k = \theta_{k+1} = \cdots = \theta_{a+b}$ in
this case, we have

\begin{align*}
\theta_{a+b} - \theta_b &= \theta_k - \theta_b \\
&\le (k-b)\bigl(2(d-b) + r - (k-b)\bigr) \\
&\le a\bigl(2(d-b) + r - a\bigr) \\
&= a\bigl(2(d-b)^+ + r - a\bigr).
\end{align*}

The first inequality is obtained from the previous case.

Case 3, $k \le b$: (49) holds because $\theta_{a+b} - \theta_b = 0$ on the
left-hand side, while the right-hand side is non-negative.

This completes the verification that condition (39) in
Lemma 10 holds.

Appendix C
Proof of Theorem 12

Proof: (sketch) Let $\mathcal{Q}$ be the set of vectors in $\mathbb{Z}^n$ which
are majorized by the vector in (40). We can check that the
vectors in $\mathcal{Q}$ belong to the polymatroid associated with rank
function $f(S) = \psi_{|S|}$, where $\psi_x$ is defined by

\[ \psi_x := \min(k,x)\,\alpha - \sum_{j=0}^{\min(k,x)-k+\ell r}\left\lceil \frac{j}{r} \right\rceil r \tag{50} \]

for $x = 0,1,2,\ldots,n$. (The summation is defined as 0 if the
upper limit is negative.) The sum of the components in (40)
is equal to $B$,

\[ (k - \ell r)\alpha + r\sum_{i=1}^{\ell}(\alpha - ir) = k\alpha - r^2\frac{\ell(\ell+1)}{2} = B = \psi_k, \]

and the difference between consecutive $\psi_x$ is equal to the $x$-th
component in (40),

\[ \psi_x - \psi_{x-1} = \begin{cases}
\alpha & \text{if } 0 < x \le k - \ell r, \\
\alpha - r & \text{if } k - \ell r < x \le k - (\ell-1)r, \\
\alpha - 2r & \text{if } k - (\ell-1)r < x \le k - (\ell-2)r, \\
\ \vdots & \\
\alpha - \ell r & \text{if } k - r < x \le k, \\
0 & \text{if } k < x \le n,
\end{cases} \]

for $x = 1,2,\ldots,n$.

The proof is along the same lines as the proof of Theorem 9. We define the same submodular function on the vertex set of this auxiliary graph, except that $\beta_1$ is equal to 1 and $\alpha$ is equal to $d + r(\ell+1) - k$ in this proof. We want to show that the inequality $\mathrm{lb}(\Delta^{\mathrm{in}} S) - \mathrm{ub}(\Delta^{\mathrm{out}} S) \le \sigma(S)$ holds for all subsets $S$ which are of the form (46).

Analogous to (49), we need to show that

$$\phi_{a+b} - \phi_b \le a\big((d-b)^+ + r - a\big) \tag{51}$$

for $0 \le a \le r$ and $0 \le b \le d$.

As in the proof of Theorem 9, we consider three cases: (1) $a + b \le k$, (2) $b < k < a + b$, and (3) $k \le b$.

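Case (1), $a + b \le k$. Note that the difference $\phi_x - \phi_{x-1}$, i.e., the $x$-th component in (40), can be written as

$$\alpha - \left\lceil \frac{x - k + \ell r}{r} \right\rceil r. \tag{52}$$

We need to take the sum of (52) for $b < x \le b + a$. Note that the value of (52) is constant for $r$ consecutive values of $x$. Since $a$ is no larger than $r$, $\lceil (x - k + \ell r)/r \rceil$ assumes at most two values for $b < x \le b + a$. We further divide into two cases.

First subcase: $\lceil (x - k + \ell r)/r \rceil$ is constant for $b < x \le b + a$. For $x$ in this range, $\lceil (x - k + \ell r)/r \rceil$ is equal to $\lceil (a + b - k + \ell r)/r \rceil$. We have

$$\begin{aligned}
a\big((d-b)^+ + r - a\big) - (\phi_{a+b} - \phi_b)
&= a\big((d-b)^+ + r - a\big) - \sum_{x=b+1}^{a+b} \left(\alpha - \left\lceil \frac{a + b - k + \ell r}{r} \right\rceil r\right) \\
&\ge a\big(d - b + r - a - \alpha + (a + b - k + \ell r)\big) \\
&= a\big(d + (\ell+1)r - k - \alpha\big) = 0.
\end{aligned}$$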



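Second subcase: $\lceil (x - k + \ell r)/r \rceil$ is not constant for $b < x \le b + a$. Suppose that $a + b > k - \ell r + \xi r$ and $b < k - \ell r + \xi r$ for some integer $\xi$. We have

$$\left\lceil \frac{x - k + \ell r}{r} \right\rceil = \begin{cases} \xi & \text{for } b < x \le k - \ell r + \xi r \\ \xi + 1 & \text{for } k - \ell r + \xi r < x \le a + b. \end{cases}$$

For the ease of presentation, we use $\delta$ to stand for $a + b - (k - \ell r + \xi r)$, and let $Y$ be $d - b + r - a$. Note that $\delta$ is positive by construction. With these notations, we get

$$\begin{aligned}
a\big((d-b)^+ + r - a\big) - (\phi_{a+b} - \phi_b)
&\ge aY - (\phi_{a+b} - \phi_b) \\
&= aY - (a - \delta)(\alpha - \xi r) - \delta\big(\alpha - (\xi+1)r\big) \\
&= (a - \delta)(Y - \alpha + \xi r) + \delta\big(Y - \alpha + (\xi+1)r\big).
\end{aligned}$$

Since

$$Y - \alpha + \xi r = d - b + r - a - d - r\ell - r + k + \xi r = -\delta,$$

we get

$$a\big((d-b)^+ + r - a\big) - (\phi_{a+b} - \phi_b) \ge -(a - \delta)\delta + \delta(r - \delta) = \delta(r - a) \ge 0.$$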

This proves case (1). The proofs of case (2) and case (3) are similar to those in Theorem 9 and are omitted.

The proof proceeds by applying Theorem 3 to recursively construct a flow on the modified information flow graph. ■
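The inequality (51) can also be checked numerically for small parameters. The following Python sketch evaluates $\phi_x$ from (50) with $\alpha = d + r(\ell+1) - k$ and tests (51) for every $0 \le a \le r$ and $0 \le b \le d$. The function names and the parameter grid are illustrative assumptions, not part of the original proof; the sketch also assumes $d \ge k$ and $0 \le \ell \le \lfloor k/r \rfloor$.

```python
def phi(x, d, k, r, ell):
    """phi_x as in (50), with alpha = d + r*(ell+1) - k (integer arithmetic)."""
    alpha = d + r * (ell + 1) - k
    top = min(k, x) - k + ell * r
    # range() is empty when top < 0, so the sum is 0 in that case.
    penalty = sum(-(-j // r) for j in range(top + 1))  # -(-j // r) == ceil(j / r)
    return min(k, x) * alpha - r * penalty


def check_51(d, k, r, ell):
    """True if phi_{a+b} - phi_b <= a((d-b)^+ + r - a) for 0<=a<=r, 0<=b<=d."""
    return all(
        phi(a + b, d, k, r, ell) - phi(b, d, k, r, ell)
        <= a * (max(d - b, 0) + r - a)
        for a in range(r + 1)
        for b in range(d + 1)
    )


if __name__ == "__main__":
    # Hypothetical small grid of parameters with d >= k.
    for k in range(1, 7):
        for r in range(1, 5):
            for d in range(k, k + 4):
                for ell in range(k // r + 1):
                    assert check_51(d, k, r, ell), (d, k, r, ell)
    print("inequality (51) holds on the tested grid")
```

A passing run is only a sanity check on a finite grid; it does not replace the case analysis above.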

Appendix D
Proof of Theorem 15

We first prove two lemmas which will be useful later. The first one is about the envelope of straight lines.

Lemma 20. Let $y = m_i x + b_i$, for $i = 1, 2, \ldots, N$, be $N$ straight lines in the $x$-$y$ plane, satisfying the following conditions:

(a) The slopes are negative with decreasing magnitudes, i.e.,

$$-m_1 > -m_2 > \cdots > -m_N > 0.$$

(b) For $j = 1, 2, \ldots, N-1$, the $x$-coordinates of the intersection point of $y = m_j x + b_j$ and $y = m_{j+1} x + b_{j+1}$, denoted by $x_j$, are strictly increasing, i.e.,

$$x_1 < x_2 < \cdots < x_{N-1}.$$

Then we have

$$\max_{j=1,\ldots,N} (m_j x + b_j) = \begin{cases} m_1 x + b_1 & \text{for } 0 \le x < x_1 \\ m_j x + b_j & \text{for } x_{j-1} \le x < x_j,\ j = 2, 3, \ldots, N-1 \\ m_N x + b_N & \text{for } x \ge x_{N-1}. \end{cases}$$

We notice that $x_j$ is equal to $(b_j - b_{j+1})/(m_{j+1} - m_j)$, and is positive by the assumptions in (a).

Proof: Since the slope of $L_i$ is more negative than the slope of $L_{i+1}$, for $i = 1, 2, \ldots, N-1$, we see that

$$\begin{aligned}
m_i x + b_i &> m_{i+1} x + b_{i+1} && \text{for } x < x_i, \\
m_i x + b_i &= m_{i+1} x + b_{i+1} && \text{for } x = x_i, \\
m_i x + b_i &< m_{i+1} x + b_{i+1} && \text{for } x > x_i.
\end{aligned}$$

For $x$ between 0 and $x_1$, we have $x < x_i$ for all $i$. Hence $m_1 x + b_1 > m_j x + b_j$ for $j = 2, 3, \ldots, N$. Therefore,

$$\max_{i=1,\ldots,N} (m_i x + b_i) = m_1 x + b_1$$

for $0 \le x < x_1$.

Consider $x$ in the interval $[x_{i-1}, x_i)$, for some $i = 2, 3, \ldots, N-1$. Since $x \ge x_{i-1} > x_{i-2} > \cdots > x_1$, we get

$$m_i x + b_i \ge m_{i-1} x + b_{i-1} > \cdots > m_1 x + b_1.$$

On the other hand, since $x < x_i < \cdots < x_{N-1}$, we get

$$m_i x + b_i > m_{i+1} x + b_{i+1} > \cdots > m_N x + b_N.$$

Therefore

$$\max_{j=1,\ldots,N} (m_j x + b_j) = m_i x + b_i$$

for $x_{i-1} \le x < x_i$.

The proof of the last case $x \ge x_{N-1}$ is similar and omitted. ■
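The envelope description in Lemma 20 can be illustrated with a short numerical sketch. The Python code below computes the breakpoints $x_j = (b_j - b_{j+1})/(m_{j+1} - m_j)$ and compares the line selected by the lemma with a brute-force maximum. The three lines used here are hypothetical data chosen to satisfy conditions (a) and (b); the function names are likewise illustrative.

```python
from fractions import Fraction


def breakpoints(lines):
    """x_j = (b_j - b_{j+1}) / (m_{j+1} - m_j) for consecutive lines (m, b)."""
    return [Fraction(b1 - b2, m2 - m1)
            for (m1, b1), (m2, b2) in zip(lines, lines[1:])]


def lemma20_index(lines, x):
    """Index of the line that Lemma 20 predicts to attain the maximum at x >= 0."""
    for j, xj in enumerate(breakpoints(lines)):
        if x < xj:
            return j
    return len(lines) - 1


def brute_force_max(lines, x):
    """Maximum of m*x + b over all lines, computed directly."""
    return max(m * x + b for m, b in lines)


# Hypothetical data: slopes -3, -2, -1 have decreasing magnitudes (condition (a)).
LINES = [(-3, 12), (-2, 10), (-1, 7)]

if __name__ == "__main__":
    xs = breakpoints(LINES)
    assert all(u < v for u, v in zip(xs, xs[1:]))  # condition (b)
    for step in range(41):
        x = Fraction(step, 4)
        m, b = LINES[lemma20_index(LINES, x)]
        assert m * x + b == brute_force_max(LINES, x)
    print("breakpoints:", xs)
```

Exact rational arithmetic is used so that the comparison at the breakpoints themselves is not affected by rounding.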

The second lemma is a special case of duality and gives a sufficient condition for optimality in a linear program.

Lemma 21. Consider the linear program with objective function $c_1 x + c_2 y$, where $x$ and $y$ are variables and $c_1$ and $c_2$ are constants, subject to constraints $a_{i1} x + a_{i2} y \ge b_i$, for $i = 1, 2, \ldots, N$, and $x, y \ge 0$. If $(\hat{x}, \hat{y})$ is a point which

(a) satisfies all constraints,

(b) attains equality in two particular constraints whose slopes are distinct, and the slope of the objective function is between these two slopes,

then $(\hat{x}, \hat{y})$ is the optimal solution to the linear programming problem.

Proof: It suffices to show that the value of the objective function cannot be smaller than $c_1 \hat{x} + c_2 \hat{y}$ without violating any constraints. By re-indexing the constraints, suppose without loss of generality that $(\hat{x}, \hat{y})$ satisfies the equations $a_{i1} \hat{x} + a_{i2} \hat{y} = b_i$ for $i = 1, 2$, and

$$a_{11}/a_{12} > a_{21}/a_{22}, \tag{53}$$
$$a_{11}/a_{12} \ge c_1/c_2 \ge a_{21}/a_{22}. \tag{54}$$

Let $A$ be the matrix

$$A := \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}.$$

The matrix $A$ is invertible by (53), and we have

$$A \begin{bmatrix} \hat{x} \\ \hat{y} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.$$

Let $[p\ q]$ be the solution to

$$[p\ q]\, A = [c_1\ c_2].$$

We can solve for $p$ and $q$ and see that the solutions

$$p = \frac{c_1 a_{22} - c_2 a_{21}}{a_{11} a_{22} - a_{12} a_{21}}, \qquad q = \frac{c_2 a_{11} - c_1 a_{12}}{a_{11} a_{22} - a_{12} a_{21}}$$

are non-negative by the assumption in (54). If $(x, y)$ is any feasible solution, then it satisfies

$$A \begin{bmatrix} x \\ y \end{bmatrix} \ge \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

with the comparison "$\ge$" carried out component-wise. By multiplying both sides by $[p\ q]$, we obtain

$$[p\ q]\, A \begin{bmatrix} x \\ y \end{bmatrix} \ge [p\ q] \begin{bmatrix} b_1 \\ b_2 \end{bmatrix},$$

that is,

$$c_1 x + c_2 y \ge [p\ q]\, A A^{-1} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = c_1 \hat{x} + c_2 \hat{y}.$$

This proves that the optimal value is indeed $c_1 \hat{x} + c_2 \hat{y}$. ■
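The duality argument in Lemma 21 can be traced on a small example. The Python sketch below solves $[p\ q] A = [c_1\ c_2]$ by Cramer's rule and checks that every feasible point on a grid has objective value at least $p b_1 + q b_2$. The linear program (minimize $x + y$ subject to $3x + y \ge 6$, $x + 2y \ge 4$, $x, y \ge 0$) is a hypothetical instance chosen so that (53) and (54) hold; it does not come from the paper.

```python
from fractions import Fraction


def dual_multipliers(row1, row2, c):
    """Solve [p q] A = [c1 c2] for A = [row1; row2] by Cramer's rule."""
    (a11, a12), (a21, a22) = row1, row2
    c1, c2 = c
    det = a11 * a22 - a12 * a21
    p = Fraction(c1 * a22 - c2 * a21, det)
    q = Fraction(c2 * a11 - c1 * a12, det)
    return p, q


# Hypothetical LP: minimize x + y subject to 3x + y >= 6, x + 2y >= 4, x, y >= 0.
ROW1, B1 = (3, 1), 6
ROW2, B2 = (1, 2), 4
C = (1, 1)

P, Q = dual_multipliers(ROW1, ROW2, C)
LOWER_BOUND = P * B1 + Q * B2  # equals the objective at the tight point (8/5, 6/5)


def is_feasible(x, y):
    """Feasibility for the hypothetical LP above."""
    return x >= 0 and y >= 0 and 3 * x + y >= B1 and x + 2 * y >= B2


if __name__ == "__main__":
    assert P >= 0 and Q >= 0  # guaranteed by (54)
    grid = [Fraction(i, 4) for i in range(33)]
    for x in grid:
        for y in grid:
            if is_feasible(x, y):
                assert C[0] * x + C[1] * y >= LOWER_BOUND
    print(P, Q, LOWER_BOUND)
```

Here the slope magnitudes of the two tight constraints are 3 and 1/2, and the objective has slope magnitude 1, which lies between them as required by condition (b).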
Proposition 22. For $j = 1, 2, \ldots, k$, if

$$\begin{vmatrix} j(d-k) + (j^2 + \Delta_{j,r})/2 & jr - \Delta_{j,r} \\ d & r-1 \end{vmatrix} \ge 0, \tag{55}$$

then

$$\gamma^*(\alpha) = (2d + r - 1)\,\frac{B - (k-j)\alpha}{j(2d - 2k + r + j)}$$

for $\alpha_{j-1} \le \alpha < \alpha_j$.

Also, we have

$$\gamma^*(\alpha) = B\,\frac{2d + r - 1}{k(2d + r - k)}$$

for $\alpha \ge \alpha_{k-1}$. Indeed, if $k < r$, the condition in (55) is satisfied for $j = k$. If $k \ge r$, (55) is satisfied for $j$ between $\lfloor k/r \rfloor r$ and $k$.

(Recall that $\gamma^*(\alpha)$ is defined as the minimum value of $d\beta_1 + (r-1)\beta_2$ subject to the constraints (23) and (24), for $s = 1, 2, \ldots, k$, and $\beta_1, \beta_2 \ge 0$.)

The proof of Prop. 22 will be divided into several lemmas.

Lemma 23. Let $m$ be a non-negative integer and $j$ be an integer in the range $1 + mr, 2 + mr, \ldots, (r-1) + mr$. Then the magnitude of the slope of $L_j(\alpha)$,

$$\frac{j(d-k) + (j^2 + \Delta_{j,r})/2}{jr - \Delta_{j,r}},$$

is a convex function of $j$ for $j$ between $1 + mr$ and $(r-1) + mr$.

Proof: Consider an integer $j$ which can be written in the form $j = i + mr$, for $0 < i < r$. Then, $\Delta_{j,r} = mr^2 + i^2$. The slope of the line $L_j(\alpha)$ has magnitude

$$\begin{aligned}
\frac{(i + mr)(d-k) + \big((i + mr)^2 + mr^2 + i^2\big)/2}{(i + mr)r - mr^2 - i^2}
&= \frac{2i^2 + 2i\big(mr + (d-k)\big) + mr\big(mr + r + 2(d-k)\big)}{2i(r-i)} \\
&= \frac{i + mr + (d-k)}{r - i} + \frac{mr\big(mr + r + 2(d-k)\big)}{2i(r-i)} \\
&= -1 + \frac{r + mr + (d-k)}{r - i} + \frac{mr\big(mr + r + 2(d-k)\big)}{2i(r-i)}.
\end{aligned}$$

Each term in the above line is a convex function of $i$ for $0 < i < r$ (the functions $1/(r-i)$ and $1/(i(r-i))$ are convex on this interval and their coefficients are non-negative). Therefore the sum of them is also convex. ■
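The algebraic decomposition used in this proof can be verified with exact arithmetic. The Python sketch below compares the slope magnitude computed directly from $\Delta_{j,r} = mr^2 + i^2$ with the three-term expression in the last line of the derivation. The function names and the tested parameter ranges are illustrative assumptions.

```python
from fractions import Fraction


def delta(j, r):
    """Delta_{j,r} = m r^2 + i^2, where j = i + m r and 0 <= i < r."""
    m, i = divmod(j, r)
    return m * r * r + i * i


def slope_magnitude(j, d, k, r):
    """(j(d-k) + (j^2 + Delta_{j,r})/2) / (j r - Delta_{j,r}); j not a multiple of r."""
    c1 = Fraction(2 * j * (d - k) + j * j + delta(j, r), 2)
    c2 = j * r - delta(j, r)
    return c1 / c2


def decomposed(i, m, d, k, r):
    """-1 + (r + mr + (d-k))/(r-i) + mr(mr + r + 2(d-k)) / (2 i (r-i))."""
    return (-1
            + Fraction(r + m * r + (d - k), r - i)
            + Fraction(m * r * (m * r + r + 2 * (d - k)), 2 * i * (r - i)))


if __name__ == "__main__":
    for r in range(2, 6):
        for m in range(0, 4):
            for k in range(1, 6):
                for d in range(k, k + 4):
                    for i in range(1, r):
                        j = i + m * r
                        assert slope_magnitude(j, d, k, r) == decomposed(i, m, d, k, r)
    print("decomposition agrees on the tested grid")
```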



Definitions: For $j = 1, 2, \ldots, k$, let $g_j(\alpha)$ be the $\beta_2$-coordinate of $P_j(\alpha)$, i.e.,

$$g_j(\alpha) := \frac{B - (k-j)\alpha}{j(2d - 2k + r + j)},$$

and let

$$\beta_2(\alpha) := \max_{1 \le j \le k} g_j(\alpha).$$

The above notations are illustrated in Fig. 15. We notice that $\beta_2(\alpha)$ is a decreasing and piece-wise linear function.

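Lemma 24.

1) For $j = 1, 2, \ldots, k-1$, we have $g_j(\alpha_j) = g_{j+1}(\alpha_j)$.

2) $$\frac{B}{k} < \alpha_1 < \alpha_2 < \cdots < \alpha_{k-1} = B\,\frac{2d + r - 1}{k(2d + r - k)}.$$

3) $\beta_2(\alpha)$ is a piece-wise linear function of $\alpha$. In fact, $\beta_2(\alpha)$ can be expressed in a case-by-case manner as

$$\beta_2(\alpha) = \begin{cases} g_1(\alpha) & \text{for } \alpha < \alpha_1 \\ g_j(\alpha) & \text{for } \alpha_{j-1} \le \alpha < \alpha_j,\ j = 2, 3, \ldots, k. \end{cases}$$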

Proof: For $j = 1, 2, \ldots, k-1$, we can solve $g_j(\alpha) = g_{j+1}(\alpha)$, i.e.,

$$\frac{B - (k-j)\alpha}{j(2d - 2k + r + j)} = \frac{B - (k-j-1)\alpha}{(j+1)(2d - 2k + r + j + 1)}$$

for $\alpha$. After some algebraic manipulations, we can check that $\alpha = \alpha_j$ is the solution.

For the second part of the lemma, we let $A$ be a short-hand notation for $k(2d - k + r)$ in this proof. We will compute the difference between two consecutive terms in the sequence $(\alpha_i)_{i=1}^{k-1}$ and show that the differences are positive. For $j = 1, 2, \ldots, k-2$, we have

$$\frac{\alpha_{k-j}}{B} - \frac{\alpha_{k-j-1}}{B} = \frac{2(d-j) + r + 1}{A - j(j-1)} - \frac{2(d-j-1) + r + 1}{A - (j+1)j},$$

which can be simplified to

$$\frac{2(k-j)(2d + r - k - j)}{\big(A - j(j-1)\big)\big(A - j(j+1)\big)}.$$

Using the assumption that $d \ge k > j$, we can see that the numerator and the denominator are positive. Thus, we get $\alpha_{k-j} > \alpha_{k-j-1}$ for $j = 1, 2, \ldots, k-2$. This proves that

$$\alpha_1 < \alpha_2 < \cdots < \alpha_{k-1}.$$

To complete the proof of the second part, we check that

$$\alpha_1 = B\,\frac{2d - 2k + r + 3}{k(2d - k + r) - (k-1)(k-2)} = \frac{B}{k}\cdot\frac{2d - 2k + r + 3}{2d - 2k + r + 3 - 2/k} > \frac{B}{k},$$

and

$$\alpha_{k-1} = B\,\frac{2d + r - 1}{k(2d + r - k)}.$$

Fig. 15. The function $\beta_2(\alpha)$ is the maximum of several affine functions. (The parameters in this example are d = 8, k = 6 and r = 3.)



We will apply Lemma 20 to prove the third part. We have already verified part (b) of Lemma 20. For the condition in part (a) of Lemma 20, we check that the magnitude of the slope of the straight line $y = g_j(\alpha)$ in the $\alpha$-$y$ plane is

$$\frac{k - j}{j(2d - 2k + r + j)}.$$

When $j$ increases, the numerator decreases and the denominator increases. Hence the magnitude of the slope is a decreasing function of $j$. ■
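The three parts of Lemma 24 can be checked numerically for the parameters of Fig. 15 ($d = 8$, $k = 6$, $r = 3$). The Python sketch below uses the closed form $\alpha_{k-i} = B\,(2(d-i) + r + 1)/(A - i(i-1))$ with $A = k(2d - k + r)$, as it appears in the proof; the normalization $B = 1$ and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def g(j, alpha, B, d, k, r):
    """g_j(alpha) = (B - (k-j) alpha) / (j (2d - 2k + r + j))."""
    return (B - (k - j) * alpha) / (j * (2 * d - 2 * k + r + j))


def alpha_break(j, B, d, k, r):
    """alpha_j, written as alpha_{k-i} with i = k - j as in the proof."""
    i = k - j
    A = k * (2 * d - k + r)
    return B * Fraction(2 * (d - i) + r + 1, A - i * (i - 1))


def envelope_index(alpha, B, d, k, r):
    """Index j attaining max_j g_j(alpha); the smallest such j on ties."""
    return max(range(1, k + 1), key=lambda j: (g(j, alpha, B, d, k, r), -j))


if __name__ == "__main__":
    d, k, r, B = 8, 6, 3, Fraction(1)  # parameters of Fig. 15, B normalized to 1
    breaks = [alpha_break(j, B, d, k, r) for j in range(1, k)]
    # Part 1: consecutive lines meet at alpha_j.
    for j in range(1, k):
        a = breaks[j - 1]
        assert g(j, a, B, d, k, r) == g(j + 1, a, B, d, k, r)
    # Part 2: B/k < alpha_1 < ... < alpha_{k-1}.
    assert B / k < breaks[0]
    assert all(u < v for u, v in zip(breaks, breaks[1:]))
    # Part 3: g_j is the maximum between alpha_{j-1} and alpha_j.
    for j in range(2, k):
        mid = (breaks[j - 2] + breaks[j - 1]) / 2
        assert envelope_index(mid, B, d, k, r) == j
    print([str(a) for a in breaks])
```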

Proof of Prop. 22: Consider $\alpha$ in the interval $[\alpha_{j-1}, \alpha_j)$. Suppose that the condition in (55) is satisfied. We will show that

$$(\beta_1, \beta_2) = \big(2g_j(\alpha), g_j(\alpha)\big),$$

which corresponds to the point $P_j(\alpha)$ in the $\beta_1$-$\beta_2$ plane, is the optimal solution to the linear programming problem in (27).

First of all, the point $P_j(\alpha)$ satisfies the inequalities in (23) and (24) for $s = j$ by construction. To show that it also satisfies (23) and (24) for $i \ne j$, we use the property that the slopes of the linear constraints in (23) and (24) are negative or infinite. If $P$ is a point in the $\beta_1$-$\beta_2$ plane which satisfies (23) (resp. (24)) with equality and $P'$ is another point in the $\beta_1$-$\beta_2$ plane such that $P' \succeq P$, then the point $P'$ also satisfies (23) (resp. (24)). For $\alpha \in [\alpha_{j-1}, \alpha_j)$, we have $\beta_2(\alpha) = g_j(\alpha)$ by Lemma 24. This implies that $P_j(\alpha)$ Pareto-dominates $P_i(\alpha)$ for all $i \ne j$ in the $\beta_1$-$\beta_2$ plane. As $P_i(\alpha)$ satisfies (23) and (24) for $s = i$, we conclude that $P_j(\alpha)$ is a feasible solution.

Because the condition in (55) is satisfied by assumption, the magnitude of the slope of $L_j(\alpha)$,

$$\frac{j(d-k) + (j^2 + \Delta_{j,r})/2}{jr - \Delta_{j,r}},$$

is larger than or equal to $d/(r-1)$. On the other hand, the magnitude of the slope of $L'_j(\alpha)$,

$$\frac{d - k + (j+1)/2}{r - 1},$$

is strictly less than $d/(r-1)$. By Lemma 21, $P_j(\alpha)$ is the optimal solution to the linear program in (27). Thus

$$\gamma^*(\alpha) = (2d + r - 1)\, g_j(\alpha) = (2d + r - 1)\,\frac{B - (k-j)\alpha}{j(2d - 2k + r + j)}$$

for $\alpha_{j-1} \le \alpha < \alpha_j$.

For the second part of the proposition, let $Q$ and $R$ be respectively the quotient and remainder when $k$ is divided by $r$, i.e., $k = Qr + R$ with $0 \le R < r$. We need to show that the determinant in (55) is non-negative.

If $k < r$, then $Q = 0$ and $\Delta_{k,r} = k^2$. For $j = k$, the determinant in (55) can be written as

$$\begin{vmatrix} kd & kr - k^2 \\ d & r - 1 \end{vmatrix} = kd(k - 1) \ge 0.$$

We thus see that the condition in (55) is satisfied in this case.

If $k \ge r$, then $Q \ge 1$. Consider an integer $j$ between $\lfloor k/r \rfloor r$ and $k$. We can write $j = Qr + R'$ for some $R'$ between 0 and $R$. We have $\Delta_{j,r} = Qr^2 + R'^2$ and $jr - \Delta_{j,r} = R'(r - R')$. The determinant in (55) becomes

$$\begin{vmatrix} j(d-k) + (j^2 + \Delta_{j,r})/2 & R'(r - R') \\ d & r - 1 \end{vmatrix}.$$

If $R' = 0$, then the above determinant is equal to

$$(r-1)\big[j(d-k) + (j^2 + \Delta_{j,r})/2\big],$$

which is positive. For $1 \le R' \le R$, this determinant can be lower bounded by

$$\begin{aligned}
\begin{vmatrix} j(d-k) + (j^2 + \Delta_{j,r})/2 & R'(r - 1) \\ d & r - 1 \end{vmatrix}
&= (r-1)\big[j(d-k) + (j^2 + \Delta_{j,r})/2 - R'd\big] \\
&= (r-1)\big[(j - R')(d-k) + (j^2 + \Delta_{j,r})/2 - R'k\big] \\
&= (r-1)\Big[(j - R')(d-k) + \tfrac{1}{2}\big(Q^2 r^2 + Q r^2 - 2R'(R - R')\big)\Big].
\end{aligned}$$

Using the fact that the geometric mean is less than or equal to the arithmetic mean, we can further lower bound it by

$$(r-1)\Big[(j - R')(d-k) + \tfrac{1}{2}\big(Q^2 r^2 + Q r^2 - R^2/2\big)\Big] \ge (r-1)\Big[(j - R')(d-k) + \tfrac{1}{2}\, r^2\big(Q^2 + Q - 1/2\big)\Big] > 0.$$

In the last inequality, we have used the fact that $d \ge k$, $j \ge R'$, and $Q \ge 1$. Therefore, the condition in (55) is satisfied for $\lfloor k/r \rfloor r \le j \le k$. This proves that $\gamma^*(\alpha) = B(2d + r - 1)/(k(2d + r - k))$ for $\alpha \ge \alpha_{k-1}$. ■
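The second part of the proposition lends itself to a brute-force check. The Python sketch below evaluates twice the determinant in (55), using $\Delta_{j,r} = \lfloor j/r \rfloor r^2 + (j \bmod r)^2$, and confirms that it is non-negative for $j = k$ when $k < r$ and for $\lfloor k/r \rfloor r \le j \le k$ when $k \ge r$. The factor of two keeps the arithmetic in integers; the function names and the parameter grid (with $d \ge k$) are illustrative assumptions.

```python
def delta(j, r):
    """Delta_{j,r} = floor(j/r) * r^2 + (j mod r)^2."""
    q, i = divmod(j, r)
    return q * r * r + i * i


def det55_times2(j, d, k, r):
    """Twice the determinant in (55), kept as an integer."""
    c1_times2 = 2 * j * (d - k) + j * j + delta(j, r)
    c2 = j * r - delta(j, r)
    return c1_times2 * (r - 1) - 2 * d * c2


def claimed_range(k, r):
    """Indices j for which the proposition asserts that (55) holds."""
    return [k] if k < r else list(range((k // r) * r, k + 1))


if __name__ == "__main__":
    for r in range(1, 7):
        for k in range(1, 13):
            for d in range(k, k + 6):
                for j in claimed_range(k, r):
                    assert det55_times2(j, d, k, r) >= 0, (j, d, k, r)
    print("condition (55) holds on the tested grid")
```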
We are now ready to cover the remaining cases which are not covered by Prop. 22. Recall that the magnitude of the slope of the line $L_j(\alpha)$ is denoted by

$$\mu_j = \frac{j(d-k) + (j^2 + \Delta_{j,r})/2}{jr - \Delta_{j,r}}.$$

Suppose that there is an integer $i$ between 1 and $k$ such that $\mu_i < \frac{d}{r-1}$. Let $\ell$ be the quotient when $i$ is divided by $r$. Because $\mu_j$ is a convex function of $j$ for $j$ between $\ell r$ and $(\ell+1)r$ (by Lemma 23 part 3), we can find an index $j_0$ such that

$$\mu_{j_0} = \min_{\ell r < j < (\ell+1)r} \mu_j. \tag{56}$$

Let $b$ be the remainder when $j_0$ is divided by $r$, so that $j_0 = \ell r + b$, and $0 < b < r$. Let $j_1$ be the largest integer larger than or equal to $j_0$ such that $\mu_{j_1} < d/(r-1)$, and let $j_2$ be the smallest integer smaller than or equal to $j_0$ such that $\mu_{j_2} < d/(r-1)$.

Definitions: For $\ell = 0, 1, 2, \ldots, \lfloor k/r \rfloor$, define

$$\alpha'_\ell := B\,\frac{d + r(\ell+1) - k}{k\big(d + r(\ell+1) - k\big) - r^2\ell(\ell+1)/2}. \tag{57}$$

We note that $\alpha'_0 = B/k$.

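Proposition 25.

1) The integers $j_1$ and $j_2$ are well-defined, and they satisfy

$$\ell r < j_2 \le j_0 \le j_1 < \min\big(k, r(\ell+1)\big).$$

2) The value of $\alpha'_\ell$ satisfies

$$\alpha_{j_0 - 1} \le \alpha'_\ell < \alpha_{j_0}.$$

3) $$\gamma^*(\alpha'_\ell) = B\,\frac{d + r - 1}{k\big(d + r(\ell+1) - k\big) - r^2\ell(\ell+1)/2}.$$

In particular, we have

$$\gamma^*(B/k) = B\,\frac{d + r - 1}{k(d + r - k)}.$$

4) Let

$$c_1(j) := j(d-k) + (j^2 + \Delta_{j,r})/2, \tag{58}$$
$$c_2(j) := jr - \Delta_{j,r}. \tag{59}$$

Let $\mathbf{A}$ be the matrix

$$\mathbf{A} := \begin{bmatrix} c_1(j_1 + 1) & c_2(j_1 + 1) \\ c_1(j_1) & c_2(j_1) \end{bmatrix},$$

where $c_1(j)$ and $c_2(j)$ are as defined in the paragraph before Theorem 15. For $\alpha$ between $\alpha'_\ell$ and $\alpha_{j_0}$, we have

$$\gamma^*(\alpha) = [d \;\; r-1]\, \mathbf{A}^{-1} \begin{bmatrix} B - (k - j_1 - 1)\alpha \\ B - (k - j_1)\alpha \end{bmatrix}.$$

5) For $\alpha$ between $\alpha_{j_0 - 1}$ and $\alpha'_\ell$, we have

$$\gamma^*(\alpha) = [d \;\; r-1]\, \mathbf{B}^{-1} \begin{bmatrix} B - (k - j_2 + 1)\alpha \\ B - (k - j_2)\alpha \end{bmatrix},$$

where

$$\mathbf{B} := \begin{bmatrix} c_1(j_2 - 1) & c_2(j_2 - 1) \\ c_1(j_2) & c_2(j_2) \end{bmatrix}.$$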

Proof: 1) The first part of the proposition follows from the property that $\mu_j = \infty$ if $j$ is an integral multiple of $r$ (Lemma 23 part 4).

2) When $\alpha$ is equal to $\alpha'_\ell$, the $r + 1$ lines $L_j(\alpha'_\ell)$, for $j = \ell r, \ell r + 1, \ldots, (\ell+1)r$, meet at the point $Q_\ell$ by the previous lemma. Because $L_{j_0}(\alpha'_\ell)$ has the smallest slope magnitude, we have

$$g_{j_0}(\alpha'_\ell) > g_j(\alpha'_\ell)$$

for all $j \in \{\ell r, \ell r + 1, \ldots, (\ell+1)r\} \setminus \{j_0\}$. By Lemma 24 part (3), this happens when $\alpha_{j_0 - 1} \le \alpha'_\ell < \alpha_{j_0}$.

3) The value of the objective function at the point $Q_\ell$ defined in the last lemma is $(d + r - 1)B / \big(k(d + r(\ell+1) - k) - r^2\ell(\ell+1)/2\big)$. It is sufficient to show that $Q_\ell$ is indeed the optimal solution to the linear program in (27).

We first show that $Q_\ell$ is feasible. We have seen in the last lemma that $Q_\ell$ satisfies the inequality in (24) with equality for $j = \ell r, \ell r + 1, \ldots, (\ell+1)r$. The point $Q_\ell$ also satisfies the inequality in (23) for $j$ between $\ell r$ and $(\ell+1)r$. This is because the slope of $L_j(\alpha'_\ell)$ is more negative than the slope of $L'_j(\alpha'_\ell)$, and the point $Q_\ell$ is to the left of the intersection point of $L_j(\alpha'_\ell)$ and $L'_j(\alpha'_\ell)$, namely $P_j(\alpha'_\ell)$, in the $\beta_1$-$\beta_2$ plane. So, the point $Q_\ell$ is lying above the line $L'_j(\alpha'_\ell)$, and satisfies the linear constraint associated with $L'_j(\alpha'_\ell)$.

Now consider an index $j$ which is strictly less than $\ell r$. Since $\alpha'_\ell > \alpha_j$ for $j = 1, 2, \ldots, \ell r - 1$, we have

$$g_j(\alpha'_\ell) < g_{\ell r}(\alpha'_\ell).$$

The point $Q_\ell$ is located vertically above the point $P_{\ell r}(\alpha'_\ell)$ in the $\beta_1$-$\beta_2$ plane. Therefore

$$P_j(\alpha'_\ell) \preceq P_{\ell r}(\alpha'_\ell) \preceq Q_\ell$$

for $j = 1, 2, \ldots, \ell r - 1$. This proves that the point $Q_\ell$ satisfies the constraints (23) and (24) for $j < \ell r$. Likewise, we can show that the point $Q_\ell$ satisfies (23) and (24) for $j > (\ell+1)r$.

The optimality of the point $Q_\ell$ follows from Lemma 21 by noting that the slope of $L_{\ell r}(\alpha'_\ell)$ is infinity and the magnitude of the slope of $L_{j_0}(\alpha'_\ell)$ is less than $d/(r-1)$.

To prove $\gamma^*(B/k) = B(d + r - 1)/(k(d + r - k))$, we check that

$$\mu_1 = \frac{d - k + 1}{r - 1} < \frac{d}{r - 1}.$$

Hence,

$$\gamma^*(B/k) = \gamma^*(\alpha'_0) = \frac{B}{k(d + r - k)}\,(d + r - 1) = B\,\frac{d + r - 1}{k(d + r - k)}.$$

4) Consider $\alpha$ which is within the range $\alpha'_\ell \le \alpha < \alpha_{j_0}$. Let $P_{\mathrm{opt}} = (\beta_{1,\mathrm{opt}}, \beta_{2,\mathrm{opt}})$ be the intersection point of $L_{j_1}(\alpha)$ and $L_{j_1 + 1}(\alpha)$, i.e.,

$$\begin{bmatrix} \beta_{1,\mathrm{opt}} \\ \beta_{2,\mathrm{opt}} \end{bmatrix} = \mathbf{A}^{-1} \begin{bmatrix} B - (k - j_1 - 1)\alpha \\ B - (k - j_1)\alpha \end{bmatrix}.$$

We will apply Lemma 21 and show that $P_{\mathrm{opt}}(\alpha)$ is the optimal solution to the linear program (27) for $\alpha \in [\alpha'_\ell, \alpha_{j_0})$. By the definition of $j_1$, the slope of $L_{j_1 + 1}(\alpha)$ has magnitude larger than or equal to $d/(r-1)$, and the slope of $L_{j_1}(\alpha)$ has magnitude strictly less than $d/(r-1)$. Thus the second requirement in Lemma 21 is satisfied. It remains to show that $P_{\mathrm{opt}}$ is feasible. We need to check that the constraints (23) and (24) are satisfied for $s = 1, 2, \ldots, k$.

Consider the vertical lines $L_{\ell r}(\alpha)$ and $L_{(\ell+1)r}(\alpha)$ in the $\beta_1$-$\beta_2$ plane. The $\beta_1$-coordinates of these two lines are respectively

$$\begin{aligned}
2g_{\ell r}(\alpha) &= 2\,\frac{B - (k - \ell r)\alpha}{\ell r(2d - 2k + r + \ell r)}, \\
2g_{(\ell+1)r}(\alpha) &= 2\,\frac{B - (k - (\ell+1)r)\alpha}{(\ell+1)r\big(2d - 2k + r + (\ell+1)r\big)}.
\end{aligned} \tag{60}$$

These two lines coincide when $\alpha = \alpha'_\ell$. We readily check that

$$\frac{k - \ell r}{\ell r(2d - 2k + r + \ell r)} > \frac{k - (\ell+1)r}{(\ell+1)r\big(2d - 2k + r + (\ell+1)r\big)}.$$

Thus $g_{\ell r}(\alpha)$ decreases faster than $g_{(\ell+1)r}(\alpha)$ when $\alpha$ increases. The vertical line $L_{\ell r}(\alpha)$ is to the left of the vertical line $L_{(\ell+1)r}(\alpha)$ when $\alpha > \alpha'_\ell$.
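The claim that the two vertical lines coincide at $\alpha = \alpha'_\ell$ can be confirmed with exact arithmetic. The Python sketch below assumes the closed form $\alpha'_\ell = B\,a_\ell / (k a_\ell - r^2\ell(\ell+1)/2)$ with $a_\ell = d - k + r(\ell+1)$, which is the threshold that appears in the positivity check later in this proof, and uses $g_j(\alpha)$ as in (60). The normalization $B = 1$, the function names and the parameter grid are assumptions for illustration.

```python
from fractions import Fraction


def g(j, alpha, B, d, k, r):
    """g_j(alpha) = (B - (k-j) alpha) / (j (2d - 2k + r + j))."""
    return (B - (k - j) * alpha) / (j * (2 * d - 2 * k + r + j))


def alpha_prime(ell, B, d, k, r):
    """Assumed closed form of alpha'_ell (see the lead-in)."""
    a = d - k + r * (ell + 1)
    # ell*(ell+1) is even, so the integer division below is exact.
    return B * Fraction(a, k * a - r * r * ell * (ell + 1) // 2)


def slope_gap(ell, d, k, r):
    """Difference of the two sides of the inequality displayed after (60)."""
    lo, hi = ell * r, (ell + 1) * r
    left = Fraction(k - lo, lo * (2 * d - 2 * k + r + lo))
    right = Fraction(k - hi, hi * (2 * d - 2 * k + r + hi))
    return left - right


if __name__ == "__main__":
    B = Fraction(1)
    for r in range(1, 5):
        for k in range(r, 13):
            for d in range(k, k + 4):
                assert alpha_prime(0, B, d, k, r) == B / k
                for ell in range(1, k // r + 1):
                    ap = alpha_prime(ell, B, d, k, r)
                    assert g(ell * r, ap, B, d, k, r) == g((ell + 1) * r, ap, B, d, k, r)
                    assert slope_gap(ell, d, k, r) > 0
    print("vertical lines coincide at alpha'_ell on the tested grid")
```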

For $s > (\ell+1)r$, the inequalities in (23) and (24) are satisfied because $P_s(\alpha) \preceq P_{(\ell+1)r}(\alpha) \preceq P_{\mathrm{opt}}$. Similarly, for $1 \le s \le \ell r$, the inequalities (23) and (24) are satisfied because $P_s(\alpha) \preceq P_{\ell r}(\alpha) \preceq P_{\mathrm{opt}}$. In the following we will show that the constraints (23) and (24) for $\ell r < s < (\ell+1)r$ are satisfied by applying Lemma 20. It is more convenient to change the coordinates such that $L_{(\ell+1)r}(\alpha)$ is the vertical axis, and write $s$ as $(\ell+1)r - z$ for $z = 1, 2, \ldots, r-1$.

Let $\beta_{2,z}(\alpha)$ be the $\beta_2$-coordinate of the intersection point of $L_{(\ell+1)r - z}(\alpha)$ and $L_{(\ell+1)r}(\alpha)$. The value of $\beta_{2,z}(\alpha)$ can be computed by solving the following linear system

$$\begin{aligned}
B &= \big(k - (\ell+1)r + z\big)\alpha + c_1\big((\ell+1)r - z\big)\beta_1 + c_2\big((\ell+1)r - z\big)\beta_2, \\
B &= \big(k - (\ell+1)r\big)\alpha + c_1\big((\ell+1)r\big)\beta_1.
\end{aligned}$$

Using the equalities

$$\begin{aligned}
\beta_1 &= 2g_{(\ell+1)r}(\alpha), \\
c_2\big((\ell+1)r - z\big) &= z(r - z), \\
c_1\big((\ell+1)r - z\big) &= \big((\ell+1)r - z\big)(d - k) + \frac{(\ell r + r - z)^2 + \ell r^2 + (r - z)^2}{2}, \\
c_1\big((\ell+1)r\big) &= (\ell+1)r(d - k) + \frac{(\ell r + r)^2 + \ell r^2 + r^2}{2},
\end{aligned}$$

we obtain the following expression for $\beta_{2,z}$,

$$\beta_{2,z}(\alpha) = 2g_{(\ell+1)r}(\alpha) - \frac{\alpha - 2g_{(\ell+1)r}(\alpha)\big(d - k + (\ell+1)r\big)}{r - z}.$$

We check that the numerator in the fraction in the line above is positive:

$$\alpha > 2g_{(\ell+1)r}(\alpha)\big(d - k + (\ell+1)r\big) \iff \alpha > \frac{B - \big(k - (\ell+1)r\big)\alpha}{(\ell+1)r(2d - 2k + 2r + r\ell)}\,\big(2d - 2k + 2(\ell+1)r\big).$$

After some algebraic manipulations, we can see that it is equivalent to

$$\alpha > \frac{B(d - k + r + r\ell)}{k(d - k + r + r\ell) - r^2\ell(\ell+1)/2},$$

which is true by the assumption that $\alpha'_\ell \le \alpha < \alpha_{j_0}$. Thus, we have

$$\beta_{2,1}(\alpha) > \beta_{2,2}(\alpha) > \cdots > \beta_{2,r-1}(\alpha).$$

For any fixed $\alpha$ in this range, we also have

$$\begin{aligned}
\beta_{2,z}(\alpha) - \beta_{2,z+1}(\alpha)
&= \Big(\alpha - 2g_{(\ell+1)r}(\alpha)\big(d - k + (\ell+1)r\big)\Big)\left(\frac{1}{r - z - 1} - \frac{1}{r - z}\right) \\
&= \Big(\alpha - 2g_{(\ell+1)r}(\alpha)\big(d - k + (\ell+1)r\big)\Big)\,\frac{1}{(r - z)(r - z - 1)}.
\end{aligned}$$

We observe that $\beta_{2,z}(\alpha) - \beta_{2,z+1}(\alpha)$ is an increasing function of $z$.

Fig. 16. Illustration of the notation used in the proof of Theorem 15. The feasible region of the linear program is drawn as the shaded area.

For $z = 1, 2, \ldots, r-1$, let $P'_z(\alpha)$ be the intersection point of line $L_{(\ell+1)r - z}$ and line $L_{(\ell+1)r - z - 1}$. Let $x_z(\alpha)$ be the horizontal distance from the point $P'_z(\alpha)$ to the vertical line $L_{(\ell+1)r}(\alpha)$ in the $\beta_1$-$\beta_2$ plane. (For example, $P'_z(\alpha)$ and $x_z(\alpha)$ for $z = 1, 2, 3$ are illustrated in Fig. 16.)

The slope of line $L_{(\ell+1)r - z}(\alpha)$, namely $-\mu_{(\ell+1)r - z}$, is negative with strictly decreasing magnitude as $z$ increases from 1 to $r - j_0$ (by Lemma 23 part 3). We have

$$x_z(\alpha) = \frac{\beta_{2,z}(\alpha) - \beta_{2,z+1}(\alpha)}{\mu_{(\ell+1)r - z} - \mu_{(\ell+1)r - z - 1}}$$

for $z = 1, 2, \ldots, r-1$. We have seen that the numerator is an increasing function of $z$. Furthermore, the denominator is decreasing as $z$ increases from 1 to $r - j_0$. Hence, we obtain

$$x_1(\alpha) < x_2(\alpha) < \cdots < x_{r - j_0}(\alpha).$$

By Lemma 20 we obtain the envelope of the lines $L_{(\ell+1)r - z}(\alpha)$ for $z = 1, 2, \ldots, r - j_0$. Because the point $P_{\mathrm{opt}}$ is one of the $P'_z(\alpha)$'s, the point $P_{\mathrm{opt}}$ is lying on the envelope of these lines. This implies that $P_{\mathrm{opt}}$ satisfies the constraints (24) for $s = \ell r + j_0, \ell r + j_0 + 1, \ldots, (\ell+1)r - 1$.

Now consider the constraints (23) for $s = \ell r + j_0, \ell r + j_0 + 1, \ldots, (\ell+1)r - 1$. In this part of the proof, we need the fact that when $\alpha < \alpha_{j_0}$, the point $P_{\mathrm{opt}}$ is lying above the line $\beta_1 = 2\beta_2$ in the $\beta_1$-$\beta_2$ plane, i.e., $\beta_{1,\mathrm{opt}} < 2\beta_{2,\mathrm{opt}}$. If the $\beta_1$-coordinate of $P_s(\alpha)$ is less than or equal to $\beta_{1,\mathrm{opt}}$, then

$$P_s(\alpha) \preceq (\beta_{1,\mathrm{opt}}, \beta_{1,\mathrm{opt}}/2) \preceq P_{\mathrm{opt}}.$$

On the other hand, if the $\beta_1$-coordinate of $P_s(\alpha)$ is larger than $\beta_{1,\mathrm{opt}}$, then the line $L_s(\alpha)$ is above the line $L'_s(\alpha)$ for $\beta_1 = \beta_{1,\mathrm{opt}}$ (cf. the argument as in the proof of Lemma [...]). In either case the point $P_{\mathrm{opt}}$ satisfies the constraint in (23).

Finally, we show that $P_{\mathrm{opt}}$ satisfies (23) and (24) for $s = \ell r + 1, \ell r + 2, \ldots, \ell r + j_0 - 1$. Because $\beta_{2,s} < \beta_{2,j_0}$ and the slope is more negative than $-\mu_{j_0}$, the line $L_s(\alpha)$ is lying below the line $L_{j_0}(\alpha)$ for $\beta_1 \ge 2g_{(\ell+1)r}(\alpha)$. As $P_{\mathrm{opt}}$ is located on or above the line $L_{j_0}(\alpha)$, we conclude that $P_{\mathrm{opt}}$ is also located above the line $L_s(\alpha)$. Hence the constraint in (24) is satisfied for $s = \ell r + 1, \ell r + 2, \ldots, \ell r + j_0 - 1$. Using the argument as in the previous paragraph, we can show that (23) is also satisfied in this case.

This completes the proof that $P_{\mathrm{opt}}$ is feasible for $\alpha$ between $\alpha'_\ell$ and $\alpha_{j_0}$. By Lemma 20 we see that the point $P_{\mathrm{opt}}$ is indeed the optimal solution to the linear program.

5) For $\alpha$ which is within the range $\alpha_{j_0 - 1} \le \alpha < \alpha'_\ell$, it can be shown as in the previous case that the intersection point of $L_{j_2}(\alpha)$ and $L_{j_2 - 1}(\alpha)$ in the $\beta_1$-$\beta_2$ plane is the optimal solution to the linear program (27). The proof is analogous to the previous case and is omitted. ■

Proof of Theorem 15: We have shown in Prop. 22 that $\gamma^*(\alpha)$ equals the constant $B(2d + r - 1)/(k(2d + r - k))$ for $\alpha \ge \alpha_{k-1}$. Also, $\gamma^*(\alpha) = \infty$ for $\alpha < B/k$. To find the boundary of the region, it suffices to consider $\alpha$ between $B/k$ and $\alpha_{k-1} = B(2d + r - 1)/(k(2d + r - k))$.

For $j = 1, 2, \ldots, k$, let $\xi_j$ be the $\alpha$-coordinate of $\mathrm{OP}_j$ defined in the theorem, i.e.,

$$\xi_j = \begin{cases} B\,\dfrac{2(d - k + j) + r - 1}{k\big(2(d - k + j) + r - 1\big) - j(j-1)} & \text{if } \mu_j \ge d/(r-1) \\[2ex] B\,\dfrac{d - k + r(\lfloor j/r \rfloor + 1)}{k\big(d - k + r(\lfloor j/r \rfloor + 1)\big) - r^2\lfloor j/r \rfloor(\lfloor j/r \rfloor + 1)/2} & \text{if } \mu_j < d/(r-1). \end{cases}$$

Using the notation introduced in the proof, we have $\xi_j = \alpha_{j-1}$ if $\mu_j \ge d/(r-1)$ and $\xi_j = \alpha'_{\lfloor j/r \rfloor}$ otherwise. Divide the interval $[B/k, \alpha_{k-1}]$ into subintervals with end points $\xi_1 \le \xi_2 \le \cdots \le \xi_k$. Note that $\xi_1 = B/k$ and $\xi_k = B(2d + r - 1)/(k(2d + r - k))$. From Prop. 22 and Prop. 25, the function $\gamma^*(\alpha)$ is an affine function in each subinterval. The boundary of the region is thus piece-wise linear, with vertices as defined in the theorem.